Foreword


The Indian Academy of Sciences launched Resonance as a monthly journal
devoted to science education in January 1996. Resonance is aimed largely at
undergraduate students and teachers of science, though material of interest to
somewhat younger students is also included. Each issue contains papers that span
a wide area of science and mathematics, in various formats. Some are individual
general articles, others consist of series with several parts. An effort is made to
ensure good expository quality in all of them.

“Echoes from Resonance” is a series of books born out of Resonance, by putting
together in a coherent manner a collection of articles (both series and single pieces)
taken from Resonance, all written around a common theme. Typically, the individual
articles would have appeared quite independently at different times. These
collections should prove useful to a reader who is keen to learn about a specific
subject, with accounts given by different authors from different perspectives, but
all in an expository manner. We hope these volumes will be useful for students and
teachers alike, and that they will complement the structure of individual issues of
Resonance, which cover different areas of science and mathematics in a balanced manner.

N. Mukunda 


Preface 


Number theory has been a subject of study by mathematicians from the most ancient
of times. In the Plimpton 322 clay artefact, excavated from the ruins of ancient
Babylon, one finds a systematic listing of a large number of Pythagorean triples, that
is, triples (a, b, c) of positive integers such that $a^2 + b^2 = c^2$; they appear to be
listed in order of increasing c/a ratio. (One sees in the table the beginnings of
trigonometry.) The Greeks had a deep interest in number theory. Euclid’s great text,
The Elements, generally considered as a book only on geometry, actually contains a
fair amount of number theory too; in particular it contains the proofs of two gems
discovered by the Greeks: the irrationality of $\sqrt{2}$ and the infinitude of the primes.
It also contains a description of the algorithm now known as the Euclidean algorithm,
which computes the greatest common divisor of two given numbers. In ancient India too
there was much interest in number theory, particularly in Diophantine equations; for
instance, in the linear two-variable equation $ax + by = c$, where a, b, c are given
integers, and in the equation later to be known as the Pell equation ($x^2 - Ny^2 = 1$,
where N is a given positive integer). Building on the work of Brahmagupta (6th
century), Bhaskara II (12th century) gave a completely general way of solving the
latter equation.

In this book we offer the reader some articles in number theory that appeared
in Resonance over the years 1996-2001. Traditionally, number theory begins with
a study of congruences (Wilson’s and Fermat’s theorems, the Chinese remainder
theorem, quadratic residues, primitive roots, ...), then proceeds to a study of prime
numbers (the infinitude of various classes of primes, divergence of the sum $\sum 1/p$
taken over all primes, ...) and later to a study of Diophantine equations (solution
of equations such as $x^2 + y^2 = z^2$ and $ax + by = c$, where a, b, c are given integers,
Pell’s equation, ...). The last two topics (prime numbers, Diophantine equations)
are distinguished by the extraordinary diversity, in terms of level of difficulty, of
the problems they offer to the student. There is something in number theory for
practically everyone!

The articles included within form a varied lot, with the first half (articles 1 to 8)
being of an elementary nature. We begin with a short essay on the axiomatic approach
in modern mathematics: on how conventions sometimes need to be followed for the
sake of preserving uniformity and maintaining mathematical harmony. The next two
articles deal with elementary problems: “Find four positive integers such that
the sum of any two is a square”, and Bachet’s problem (“100 kg with five stones”),
solved using generating functions. There is a piece on mathematical induction, one
of the most trustworthy and important techniques in the toolkit of any mathematician,
particularly the number theorist and combinatorist. The following article describes
Euler’s proof of the infinitude of primes, which establishes rather more than Euclid’s
well-known proof of the same result. Then there is a short piece on Fermat’s
two-square theorem, elaborating on a “crisp and elegant proof” by Zagier of the theorem
that a prime of the form 1 (mod 4) is a sum of two squares. The article also suggests
an algorithmic approach towards proving the theorem. In the following article,
“Fermat’s Two Squares Theorem Revisited”, Bhaskar Bagchi proves the correctness of
the algorithm. Following this is a report on recent work done on the factorization of
Fermat numbers, defined by $F_n = 2^{2^n} + 1$. (Fermat had conjectured, perhaps rather
rashly, that the numbers $F_n$ are all prime. Now it appears that for n > 4 they may
never be prime!)

Articles 9-16 are of a more substantive nature, beginning with a two-part article
(articles 9 and 10) on the class number problem (“Binary Quadratic Forms” and
“Algebraic Number Theory”), a topic dealt with for the first time and in considerable
detail by Gauss in his path-breaking book Disquisitiones Arithmeticae. The two
articles which follow, “Roots are not contained in cyclotomic fields” and “Die Ganzen
Zahlen hat Gott gemacht, alles andere ist Menschenwerk”, deal with cyclotomic
polynomials and cyclotomic fields, giving interesting applications of ideas introduced
in the previous two-part article. A proof of a beautiful relation between prime
representing quadratic equations and class numbers is the subject of the next article.
We then have an article on congruent numbers, dealing with a problem dating from
ancient times but which has intimate connections with a very modern topic, that of
elliptic curves. This is followed by an expository article on one of the great
mathematical achievements of the 20th century: the proof of “Fermat’s Last Theorem”
by Andrew Wiles. To top off the collection we have a brief survey of some currently
unsolved problems in number theory. (In passing, we remark that references
to Fermat appear surprisingly many times in this collection!)

We hope that the reader will enjoy this varied collection. 


Shailesh Shirali 
C S Yogananda 
On Provability versus Consistency in
Elementary Mathematics 


Shailesh A Shirali 

A reader asks, “Why is 1 not listed as a prime? After all, does it not satisfy the
stated criteria for primality?” This chapter is written in response to this question.

The layperson usually thinks that mathematics deals with absolute truths, and indeed
this was how mathematics was viewed during earlier centuries. However, ever since
the epochal discoveries of Bolyai, Lobachevsky and Riemann that there can be
geometries (note the plural) other than the one presented in Euclid’s text The Elements,
this implicit notion had to be dropped. Even the notion that everything in
mathematics is provably true or provably false had to be abandoned, after the astonishing
results obtained by Gödel in 1930. Alongside this development, mathematics has
seen a pioneering and extremely productive method: the axiomatic method, in which
new areas of mathematics get created merely by defining suitable sets of axioms. As a
result, the accent in mathematics has to some extent shifted to the study of axiomatic
systems, and the essential question in such cases has become one of consistency and
richness of the axiom system rather than its intrinsic truth or falsity. Much of modern
algebra, starting with group theory, the theory of fields and rings, vector spaces
and so on, can be viewed in this light. Loosely speaking, one might say that in the
modern mathematical paradigm, true is roughly equivalent to consistent, while false
is equivalent to self-contradictory¹.

¹ It is an interesting commentary on the psychology of modern mathematicians that, when pressed, most
of them will readily say that there is no such thing as absolute truth in mathematics, and that a
mathematical proposition is true or false only with reference to a particular axiomatic system. But amongst
themselves most mathematicians ‘know’ that what they deal with does indeed refer to something ‘concrete’,
‘real’ and ‘absolute’!

Here are some instances to illustrate the theme of consistency as opposed to
absolute truth. In school arithmetic, one encounters the question, “Why is $-1 \times -1 = 1$?”
Many ‘proofs’ are offered, but the plain fact is that the relation is a convention, not
an absolute truth, and therefore there is no question of proving it². One adopts it
because of its implication for the law of distributivity of multiplication over addition
(LDMA for short), according to which $a(b + c) = ab + ac$ for all a, b, c. The LDMA
is too valuable an axiom to lose! Here is roughly how it happens. Starting with N, the
set of positive integers, with × and + defined on N in the usual manner, we enlarge
the set by including 0 and imposing the following rules:

$$a + 0 = 0 + a = a, \qquad a \times 0 = 0 \times a = 0.$$

Note that the two statements are consistent with one another because of the LDMA.
For example, $2 \times 3 = 2 \times (3 + 0) = 2 \times 3 + 2 \times 0$, so we must have $2 \times 0 = 0$. Next,
one constructs the negative numbers via the rule $a + (-a) = 0$. To do addition we
call upon commutativity and associativity. For instance we have:

$$(-2) + (-3) + (2 + 3) = (-2) + 2 + (-3) + 3 = 0 + 0 = 0.$$

Therefore, $(-2) + (-3) + 5 = 0$ and $(-2) + (-3) = -5$.

Finally, multiplication is taken up, and here one invokes distributivity. We find that
we are forced to adopt the convention that $-1 \times 1 = -1$ and $-1 \times -1 = 1$:

$$0 = 0 \times 1 = \{1 + (-1)\} \times 1 = \{1 \times 1\} + \{(-1) \times 1\} = 1 + \{(-1) \times 1\},$$

therefore, $(-1) \times 1 = -1$; and,

$$0 = \{1 + (-1)\} \times (-1) = \{1 \times (-1)\} + \{(-1) \times (-1)\} = -1 + \{(-1) \times (-1)\},$$

therefore, $(-1) \times (-1) = +1$. The point is that we need these relations if we are to
preserve the LDMA, which we cannot afford to lose. The consistency of the system
must be preserved at all cost³.

Here is another question, also asked at the school level: Why is $a^0 = 1$ for all
$a > 0$? We proceed to resolve this in a similar vein. Let $x, y \in \mathbf{N}$; then
$a^{x+y} = a^x \times a^y$, and $a^{x-y} = a^x / a^y$ when $x > y$. These follow from the very meaning
of $a^n$ when n is a positive integer. What do we do with $a^0$? If we wish to have a system
of algebra that is consistent and easy to work with, then we need to adopt the
convention that $a^0 = 1$. There is nothing absolute about this. Rather, we choose to give
$a^0$ a meaning that makes it easy to deal with. In short, we make $a^0$ a well-behaved
object. (Note that $0^0$ cannot be given any consistent meaning, nor can 0/0; that is, it is
not possible to make these objects well-behaved.)

² Here is a particularly preposterous proof which I encountered a few years back: the parabola $y = x^2$ is
symmetric in the y-axis, therefore minus times minus equals plus!

³ Sacrificing the LDMA would mean that we lose the ring structure of Z.

Finally we take up the question: “Is 1 a prime?” We recall the fundamental
theorem of arithmetic (FTA): Every integer N > 1 can be expressed in just one way as
a product of primes, except possibly for the order of occurrence of the primes. If 1
were included in the set of primes P, then the fact that $1^n = 1$ for all integers n would
require us to rephrase the FTA by adding the clause “... except that 1 may occur to
any arbitrary power.” We would end up labelling 1 as a special prime, to be excluded
from most of the interesting theorems about prime numbers. Indeed, what would in
all likelihood happen is that theorems about primes would end up being phrased in
terms of the set $P' = P \setminus \{1\}$. Thus giving 1 membership in P proves to be a nuisance,
and it is simpler to keep it out right at the start.

The matter can be considered from another viewpoint. Let Z denote the set of
integers, and consider the set of complex numbers of the form $a + bi$, where $a, b \in \mathbf{Z}$
and $i = \sqrt{-1}$. These are the Gaussian integers, first studied in detail by Gauss, and
the set of such numbers is denoted by Z(i). (Note that Z is a subset of Z(i).) Now
in Z, the only elements that possess multiplicative inverses are ±1 (that is, their
reciprocals lie within the same set); these are the units of Z. In Z(i), the set of units
turns out to be $\{\pm 1, \pm i\}$. (The reader is invited to verify that there are no other units
in Z(i).) Arithmetic can be done in Z(i) just as it is in Z; for instance, we can factorize
numbers:

$$9 + 7i = (2 + 3i)(3 - i), \qquad 13 = (2 + 3i)(2 - 3i), \ \ldots$$

Observe that 13, which is prime in Z, loses its primality status in Z(i).

We declare a number $z \in \mathbf{Z}(i)$ to be prime if z is not a unit and if in every
factorization $z = uv$, with $u, v \in \mathbf{Z}(i)$, either u or v is a unit⁴. The reader is invited to
verify that 3, 7 and $2 + 3i$ are Gaussian primes, whereas 2, 5 and 13 are composite (because
$2 = (1 + i)(1 - i)$, $5 = (1 + 2i)(1 - 2i)$, etc.). We now have the result: every number
in Z(i), not 0 or a unit, can be written as a product of Gaussian primes; moreover,
there is essentially only one way of doing this⁵. That is, we have an analogue of the
FTA for the Gaussian integers, provided that the units are not considered as primes.
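The factorizations and primality claims for Gaussian integers quoted above can be checked mechanically. What follows is only a minimal sketch, under the assumption that a Gaussian integer a + bi is stored as the pair (a, b); the helper names `gmul`, `norm` and `is_gaussian_prime` are illustrative and do not come from the article. The primality test is a brute-force search that relies on the multiplicativity of the norm $a^2 + b^2$: a non-unit proper factor of z must have norm strictly between 1 and the norm of z.

```python
from itertools import product

def gmul(u, v):
    """Multiply Gaussian integers u = (a, b) and v = (c, d), i.e. (a+bi)(c+di)."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)

def norm(z):
    """Norm N(a+bi) = a^2 + b^2; it is multiplicative and units have norm 1."""
    return z[0] ** 2 + z[1] ** 2

def is_gaussian_prime(z):
    """Brute-force test in the article's sense: z is not 0 or a unit and
    has no divisor u with 1 < N(u) < N(z)."""
    n = norm(z)
    if n <= 1:
        return False
    bound = int(n ** 0.5) + 1
    for a, b in product(range(-bound, bound + 1), repeat=2):
        nu = norm((a, b))
        if 1 < nu < n and n % nu == 0:
            # u divides z exactly when z * conj(u) is divisible by N(u).
            x, y = gmul(z, (a, -b))
            if x % nu == 0 and y % nu == 0:
                return False
    return True

print(gmul((2, 3), (3, -1)))   # the product (2 + 3i)(3 - i)
print([is_gaussian_prime(z) for z in [(3, 0), (7, 0), (2, 3), (2, 0), (5, 0), (13, 0)]])
```

The search is quadratic in the size of z and is meant only for the small examples in the text.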

Other such number systems can be constructed. Indeed, once one grasps the idea,
such systems seem to be available in abundance and can be spotted in many
settings. For instance, consider the set $\mathbf{Z}(\sqrt{2})$ whose elements are numbers of the form
$a + b\sqrt{2}$ where $a, b \in \mathbf{Z}$. This system presents itself quite naturally when one tries
to solve the equation $x^2 - 2y^2 = \pm 1$ in integers. A striking fact about $\mathbf{Z}(\sqrt{2})$ is
that it has infinitely many units. (The reader is invited to show this. Hint: Show that
$\sqrt{2} - 1$ and its integral powers are units; (harder) show that, up to sign, these are the
only units.) What are the primes of $\mathbf{Z}(\sqrt{2})$? It turns out that $\sqrt{2}$ is prime, as are the
numbers 3, 5 and 11, but not 7, because $7 = (3 - \sqrt{2}) \times (3 + \sqrt{2})$, nor 17, because
$17 = (5 - 2\sqrt{2}) \times (5 + 2\sqrt{2})$. It is an interesting exercise to classify the primes
of $\mathbf{Z}(i)$ and $\mathbf{Z}(\sqrt{2})$. Is there an analogue of the FTA for $\mathbf{Z}(\sqrt{2})$? The answer is
“yes”, though it is hard work to prove it. However, there are numerous number
systems that closely resemble $\mathbf{Z}(i)$ and $\mathbf{Z}(\sqrt{2})$ but which do not have the FTA property.
An example is $\mathbf{Z}(\sqrt{10})$: it can be shown that 2, 3, $4 - \sqrt{10}$ and $4 + \sqrt{10}$ are primes in
$\mathbf{Z}(\sqrt{10})$, yet

$$6 = 2 \times 3 = (4 - \sqrt{10}) \times (4 + \sqrt{10}),$$

providing a counterexample to the FTA. As the reader will have noted by now, the
word prime no longer carries a fixed meaning; it acquires meaning only with
reference to a particular context⁶. The interested reader can consult the well-known text
by G H Hardy and E M Wright (An Introduction to the Theory of Numbers, Chapters
XIV and XV) for further details.

⁴ Since this article deals with terminology, it should be pointed out that what we refer to as ‘prime’ here
is usually called ‘irreducible’ in the standard texts. In the standard definition, p is prime if we have the
implication $p \mid ab \Rightarrow p \mid a$ or $p \mid b$. In the class of rings known as UFDs the two notions coincide. Examples
of UFDs are $\mathbf{Z}$, $\mathbf{Z}(i)$ and $\mathbf{Z}(\sqrt{2})$. However, $\mathbf{Z}(\sqrt{10})$ is not a UFD.

⁵ Units may enter the picture, hence the use of the words ‘essentially only one way’.
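The statements about units in Z(√2) and about the two factorizations of 6 in Z(√10) can be explored numerically. The sketch below assumes that $a + b\sqrt{d}$ is stored as the pair (a, b) and uses the norm $a^2 - d b^2$, which is multiplicative and equals ±1 exactly for units; the function names are hypothetical. The last check records the standard reason why 2, 3 and 4 ± √10 admit no proper factors: such a factor would need norm ±2 or ±3, which would force a square to be congruent to ±2 or ±3 modulo 10.

```python
def mul(u, v, d):
    """Multiply u = a + b*sqrt(d) and v = c + e*sqrt(d), each stored as a pair."""
    a, b = u
    c, e = v
    return (a * c + d * b * e, a * e + b * c)

def norm(u, d):
    """Norm N(a + b*sqrt(d)) = a^2 - d*b^2; units are the elements of norm +1 or -1."""
    a, b = u
    return a * a - d * b * b

def power(u, n, d):
    """n-th power (n >= 0) of u by repeated multiplication."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u, d)
    return result

# Powers of sqrt(2) - 1, stored as (-1, 1): every norm is +1 or -1.
unit_norms = [norm(power((-1, 1), n, 2), 2) for n in range(1, 11)]

# The two factorizations of 6 in Z(sqrt(10)).
six_a = mul((2, 0), (3, 0), 10)
six_b = mul((4, -1), (4, 1), 10)

# a^2 - 10*b^2 = t implies a^2 = t (mod 10); no square is +/-2 or +/-3 mod 10.
squares_mod_10 = {(a * a) % 10 for a in range(10)}
no_small_norms = all(t % 10 not in squares_mod_10 for t in (2, -2, 3, -3))

print(unit_norms, six_a, six_b, no_small_norms)
```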

Here is another example of axiomatic generalization. A rational number can be
thought of as a root of the equation $mx + n = 0$, with $m, n \in \mathbf{Z}$, $m \neq 0$; here m = 1
gives us the integers, which we call the rational integers. Generalizing, we define an
algebraic number as a root of the polynomial equation $ax^n + bx^{n-1} + cx^{n-2} + \cdots = 0$
with $a, b, c, \ldots \in \mathbf{Z}$, $a \neq 0$ and $n \in \mathbf{N}$; if a = 1 then we have an algebraic integer.
It is a non-trivial fact that the set A of algebraic integers is closed under addition
and multiplication but not under division. Thus A behaves very much like Z, and we
have at hand a genuine generalization of the notion of integer.

These examples may serve to highlight the extraordinary freedom that the
axiomatic approach brings into mathematics. Some critics complain, however, that in
exercising this freedom, mathematicians tend to “go too far”; but that is another
matter altogether and we shall not address it here.

TAIL-PIECE. Mr T B Nagarajan of Thanjavur has sent me the following problem:
Find four distinct positive integers such that the sum of any two of them is a square.
He writes that the problem is not too hard if the restriction on positivity is removed,
or if one is content with solutions having very large integers. In support of this
statement, he lists the following solutions:

{55967, 78722, 27554, 10082}, {15710, 86690, 157346, 27554}.

Readers are invited to take a crack at the problem. (To find a triple with the stated
property is much easier; an example is {6, 19, 30}. Readers may enjoy trying to list
further such triples before going on to the more challenging four-number problem.)
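For readers who would like to experiment, here is a small brute-force sketch that checks the two quadruples listed above and enumerates triples up to a chosen bound. The function names and the search bound are assumptions made for illustration; nothing here is claimed to be Mr Nagarajan's method.

```python
from itertools import combinations
from math import isqrt

def is_square(n):
    """True if n is a perfect square."""
    r = isqrt(n)
    return r * r == n

def pairwise_sums_square(nums):
    """True if the sum of every two of the given numbers is a square."""
    return all(is_square(x + y) for x, y in combinations(nums, 2))

def triples(limit):
    """All triples a < b < c <= limit of positive integers with the property."""
    return [t for t in combinations(range(1, limit + 1), 3)
            if pairwise_sums_square(t)]

print(pairwise_sums_square([55967, 78722, 27554, 10082]))
print(pairwise_sums_square([15710, 86690, 157346, 27554]))
print(triples(30))
```

The exhaustive search over triples grows cubically with the bound, so it is suitable only for small limits.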


Shailesh A Shirali 
Rishi Valley School 
Rishi Valley 517 352 
Andhra Pradesh 


⁶ Historically, many of these developments were a result of efforts to prove Fermat’s last theorem. See
Resonance, Volume 1, No. 1 for more details.
To Find Four Distinct Positive Integers such
that the Sum of Any Two of them is a Square 


S H Aravind 

The problem is to find four distinct positive integers such that any two of them add
up to a square. Let a, b, c, d with a < b < c < d be four positive integers such that
the sum of any two of them is a square. Observing that

$$a + b + c + d = (a + b) + (c + d),$$
$$a + b + c + d = (a + c) + (b + d),$$
$$a + b + c + d = (a + d) + (b + c),$$

we need to find a number which can be written as a sum of two non-zero squares in
three different ways. We proceed to find such a number.

To begin with, note that if two numbers n and n' can each be expressed as a sum
of two squares, then nn' can also be so expressed in two ways. Indeed, if

$$n = k^2 + l^2, \qquad n' = k'^2 + l'^2,$$

then

$$nn' = (kk' + ll')^2 + (kl' - lk')^2 = (kl' + lk')^2 + (kk' - ll')^2.$$

Start with 25 and 13, both of which are sums of two squares: $25 = 3^2 + 4^2$,
$13 = 2^2 + 3^2$. By the identity, 25 × 13 = 325 can be expressed as a sum of two
squares: $325 = 10^2 + 15^2 = 6^2 + 17^2 = 1^2 + 18^2$. (Note that $10^2 + 15^2 = 5^2(2^2 + 3^2)$.)
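The two-squares identity above is easy to try out. In this minimal sketch the function name `product_representations` is an illustrative choice; it returns the two representations of nn' given by the identity, with absolute values taken so that the entries are non-negative.

```python
def product_representations(k, l, k2, l2):
    """Given n = k^2 + l^2 and n' = k2^2 + l2^2, return the two pairs (p, q)
    with p^2 + q^2 = n * n' supplied by the identity."""
    first = (abs(k * k2 + l * l2), abs(k * l2 - l * k2))
    second = (abs(k * l2 + l * k2), abs(k * k2 - l * l2))
    return first, second

# 25 = 3^2 + 4^2 and 13 = 2^2 + 3^2 give two representations of 325.
print(product_representations(3, 4, 2, 3))
# Writing 25 = 5^2 + 0^2 instead gives the third representation of 325.
print(product_representations(5, 0, 2, 3))
```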

We show that $13 \times 25^2 = 8125$ has three representations as a sum of two squares
which give us a solution. Consider the following representations (among others):

$$8125 = 30^2 + 85^2 = 50^2 + 75^2 = 58^2 + 69^2.$$

Thus, we take a + b + c + d = 8125 and look for solutions a, b, c, d in positive
integers. Then a + b, a + c, a + d, b + c, b + d, c + d are precisely $30^2$, $85^2$, $50^2$, $75^2$,
$58^2$, $69^2$ in some order. We have

$$a + b < a + c < a + d < b + d < c + d,$$
$$\text{and} \quad a + b < a + c < b + c < b + d < c + d.$$

We arbitrarily take b + c to be less than a + d. So we have $a + b = 30^2$, the least of
the squares, $a + c = 50^2$, the next smallest, and $c + b = 58^2$. We get c − b = 1600 and
solving for c, b we have c = 2482, b = 882. From this we get a = 18, d = 4743. Thus
{18, 882, 2482, 4743} is a set of four positive integers with the required property.
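The procedure just described can be sketched in code. The version below is an illustration under stated assumptions rather than the author's own program: the function names are hypothetical, and instead of fixing b + c < a + d in advance it tries both assignments of the third representation.

```python
from itertools import combinations
from math import isqrt

def two_square_representations(n):
    """All pairs (p, q) with 0 < p <= q and p^2 + q^2 = n, in increasing p."""
    reps = []
    for p in range(1, isqrt(n // 2) + 1):
        q = isqrt(n - p * p)
        if q * q == n - p * p:
            reps.append((p, q))
    return reps

def quadruples(n):
    """Quadruples a < b < c < d with a + b + c + d = n and all pairwise sums
    square, built from three representations of n as a sum of two squares."""
    found = []
    for (p1, q1), (p2, q2), (p3, q3) in combinations(two_square_representations(n), 3):
        # Smallest square is a + b, next smallest is a + c; the third
        # representation supplies b + c and a + d in one of two orders.
        for bc, ad in ((p3 * p3, q3 * q3), (q3 * q3, p3 * p3)):
            twice_a = p1 * p1 + p2 * p2 - bc
            if twice_a <= 0 or twice_a % 2:
                continue
            a = twice_a // 2
            b, c, d = p1 * p1 - a, p2 * p2 - a, ad - a
            if a < b < c < d and b + d == q2 * q2 and c + d == q1 * q1:
                found.append((a, b, c, d))
    return found

print(two_square_representations(8125))
print(quadruples(8125))
```

Because 8125 has more than three representations, the search may report quadruples beyond the one derived in the text.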

The same method can give large integer solutions too. For example, the following
solutions can be obtained by choosing suitable squares:

{4190, 10210, 39074, 83426}, {7070, 29794, 71330, 172706}.


S H Aravind 
12, First Main Road 
Ponmeni Jayanagar 
Madurai 625 010 
Bachet’s Problem


B Bagchi 

A grocery shopkeeper keeps five stones of different weights. He is able to use 
a common balance and weigh out quantities ranging from 1 to 100 kg, in steps 
of 1 kg. What are the weights of these five stones? 

The above is the problem “100 kg with five stones” posed by R Yusufzai in the
“Think it Over” column of the July 1996 issue of Resonance. A much better problem
will result if the figure 100 is replaced by 121. This is because the question “What are
the weights of these five stones?” seems to suggest that there are uniquely determined
weights to be found! However, as may easily be verified, the weights in kg of the
stones might be 1, 3, 9, 27 and m, where m is any integer in the range 60 ≤ m ≤ 81.
In fact, there are many other solutions to the problem as posed. If, however, it was
given that the grocer can weigh any object of weight between 1 kg and 121 kg (in
steps of 1 kg) using his five stones, then the weights (in kg) of the stones must have
been 1, 3, 9, 27 and 81. This is the case k = 5 of the result stated and proved below.
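The claim about the fifth stone is the kind of thing that "may easily be verified" by machine. A brute-force sketch, with illustrative function names and an arbitrarily chosen search range of 1 to 200 for m, follows; each stone is placed on the left pan, on the right pan, or left unused.

```python
from itertools import product

def measurable(stones):
    """Set of positive weights that can be balanced, each stone going on
    either pan (sign +1 or -1) or staying unused (sign 0)."""
    totals = set()
    for signs in product((-1, 0, 1), repeat=len(stones)):
        total = sum(s * w for s, w in zip(signs, stones))
        if total > 0:
            totals.add(total)
    return totals

def covers(stones, limit):
    """True if every integral weight from 1 to limit can be measured."""
    return set(range(1, limit + 1)) <= measurable(stones)

# Which fifth stones m complete 1, 3, 9, 27 to a solution for 1..100 kg?
valid_m = [m for m in range(1, 201)
           if m not in (1, 3, 9, 27) and covers((1, 3, 9, 27, m), 100)]
print(valid_m)
print(covers((1, 3, 9, 27, 81), 121))
```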

The problem is a well-known variation of an old problem due to Bachet (see 
Suggested Reading). In the original binary version, the grocer cannot subtract, so 
he must put the stones in one pan and the object in the other. Mr Yusufzai’s problem 
is an instance of the ternary version where this restriction is removed. The general 
problem (in its ternary version) may be stated as follows: 

Given a positive integer k, find the largest integer $N_k$ such that any
object whose weight is an integer between 1 and $N_k$ (ends included) can
be weighed using k stones of suitable integral weights. In this notation,
the problem is to show that $N_5 \geq 100$.

In fact, we have:

Theorem. $N_k = \frac{3^k - 1}{2}$. If k stones are such that all integral weights between 1
and $N_k$ can be measured using them, then the weights of these stones must be $3^j$,
$0 \leq j \leq k - 1$.

This is, essentially, Theorem 141 in the book by Hardy and Wright (see Suggested
Reading).

In order to prove this, we must convert it into a precise mathematical statement.
To this end, let $a_0, \ldots, a_{k-1}$ be the (positive integral) weights of k stones. In order to
weigh an object of integral weight m, the grocer places the object together with some
of the stones on the right pan (say) and puts some other stones on the left pan. For
$0 \leq j \leq k - 1$, put $\varepsilon_j = 1$ if the stone of weight $a_j$ is placed on the left pan, $\varepsilon_j = -1$
if it is on the right pan, $\varepsilon_j = 0$ if it is not used. Since the two pans must balance, we
get

$$m = \sum_{j=0}^{k-1} \varepsilon_j a_j, \quad \text{where } \varepsilon_j \in \{0, 1, -1\} \text{ for } 0 \leq j \leq k - 1. \qquad (1)$$

This leads us to 

Definition. If $A = \{a_0, \ldots, a_{k-1}\}$ is a finite set of positive integers, then the
capacity C(A) of A is the largest integer M such that for every integer m in the
range $1 \leq m \leq M$, equation (1) has a solution.

Informally, the capacity C(A) is the largest M such that all weights between 1 and M
can be measured using k stones whose weights are in A. In terms of this definition,
the above theorem may be restated as follows:

Theorem. If A is of size k, then $C(A) \leq \frac{3^k - 1}{2}$. Equality holds here if and only if
$A = \{3^j : 0 \leq j \leq k - 1\}$.

To prove the theorem, note that if m can be written as in (1), then so can −m (just
change the signs of all the $\varepsilon_j$); also, trivially, m = 0 can be written thus (take $\varepsilon_j = 0$
for all j). Therefore, if C(A) = M, then all the 2M + 1 integers m in the range
$-M \leq m \leq M$ can be expressed as in (1). But there are three choices for $\varepsilon_j$ for
each j, hence only $3^k$ choices for the right-hand side of (1). Hence, $2M + 1 \leq 3^k$,
or $C(A) \leq (3^k - 1)/2$. Now, if we take $A = \{3^j : 0 \leq j \leq k - 1\}$, then for
$1 \leq m \leq (3^k - 1)/2$ write $[(3^k - 1)/2] - m$ in base 3:

$$\frac{3^k - 1}{2} - m = \sum_{j=0}^{k-1} \delta_j 3^j,$$

where $\delta_j \in \{0, 1, 2\}$. Put $\varepsilon_j = 1 - \delta_j$. Then (1) holds. Thus $C(A) \geq (3^k - 1)/2$ for
this set. Together with the previous inequality, we get $C(A) = (3^k - 1)/2$.
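The existence half of the proof is constructive, and the construction translates directly into code. The sketch below follows the base-3 argument above; the function name `pan_assignment` and the list representation of the signs are illustrative choices.

```python
def pan_assignment(m, k):
    """Return [eps_0, ..., eps_{k-1}], each in {-1, 0, 1}, such that
    sum(eps_j * 3**j) == m, for 1 <= m <= (3**k - 1) // 2.
    As in the proof: write (3**k - 1)/2 - m in base 3 with digits delta_j
    and put eps_j = 1 - delta_j."""
    top = (3 ** k - 1) // 2
    if not 1 <= m <= top:
        raise ValueError("m must lie between 1 and (3**k - 1) // 2")
    value = top - m
    eps = []
    for _ in range(k):
        value, delta = divmod(value, 3)   # peel off the next base-3 digit
        eps.append(1 - delta)
    return eps

# Weighing 100 kg with the stones 1, 3, 9, 27, 81:
# +1 means left pan, -1 means right pan (with the object), 0 means unused.
print(pan_assignment(100, 5))
```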

Only the uniqueness part of the theorem remains to be proved. In fact, this is the
only non-trivial and interesting part. To prove this, let $A = \{a_0, \ldots, a_{k-1}\}$ have
capacity $N_k$. Since, now, equality holds in the inequality $C(A) \leq (3^k - 1)/2$ which
appears in the statement of the theorem, the proof of the inequality shows that every
integer m in the range $-(3^k - 1)/2 \leq m \leq (3^k - 1)/2$ has a unique representation
(1); conversely, any m of the form (1) belongs to this range. Therefore, letting X be
an indeterminate, we get

$$\prod_{j=0}^{k-1} \left( X^{-a_j} + 1 + X^{a_j} \right) = \sum_{m=-(3^k-1)/2}^{(3^k-1)/2} X^m, \qquad (2)$$

as may be verified by multiplying out the left-hand side. Since, in particular, the largest
integer (viz., $\sum_{j=0}^{k-1} a_j$) of the form (1) must be the largest integer in the range
$\left[ -\frac{3^k - 1}{2}, \frac{3^k - 1}{2} \right]$, we also have

$$\sum_{j=0}^{k-1} a_j = \frac{3^k - 1}{2}. \qquad (3)$$

Using (3) and a little algebra, (2) simplifies to

$$\prod_{j=0}^{k-1} \frac{X^{3a_j} - 1}{X^{a_j} - 1} = \frac{X^{3^k} - 1}{X - 1}. \qquad (4)$$

Now fix j, $0 \leq j \leq k - 1$. Let $\omega$ be a primitive $3a_j$-th root of unity. That is, $\omega$ is a
complex number such that $\omega^l = 1$ if and only if l is an integral multiple of $3a_j$. (For
instance, we may take $\omega = \exp(2\pi\sqrt{-1}/3a_j)$.) Then $\omega$ is a zero of the left-hand side,
and hence also of the right-hand side, of (4). Thus, $\omega^{3^k} = 1$. So $3a_j$ divides $3^k$. That is,
$a_j \in \{3^i : 0 \leq i \leq k - 1\}$. Since this holds for all j, we have $A \subseteq \{3^i : 0 \leq i \leq k - 1\}$.
Since both sets have size k, we must have $A = \{3^i : 0 \leq i \leq k - 1\}$. This proves the
uniqueness of the set of given size and maximum capacity.
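The uniqueness statement can be sanity-checked by exhaustive search in a small case. The sketch below is a brute-force check and not the root-of-unity argument of the proof: it takes k = 3, restricts the stone weights to the range 1 to 20 (an arbitrary bound chosen for illustration), and lists every set whose capacity reaches $(3^3 - 1)/2 = 13$.

```python
from itertools import combinations, product

def capacity(stones):
    """Largest M such that every integer 1..M is a signed sum of the stones
    with coefficients in {-1, 0, 1}; this is C(A) from the definition."""
    reachable = {sum(s * w for s, w in zip(signs, stones))
                 for signs in product((-1, 0, 1), repeat=len(stones))}
    m = 0
    while m + 1 in reachable:
        m += 1
    return m

# All 3-element sets of weights in 1..20 whose capacity equals 13.
best = [A for A in combinations(range(1, 21), 3) if capacity(A) == 13]
print(best)
```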

The reader may like to look up the proof in the book by Hardy and Wright, which is 
very different from the proof given here. It is a clever use of mathematical induction. 


TAIL-PIECE. Bachet is better remembered by mathematicians for another reason. It
was on Bachet’s edition of Diophantus’ Arithmetica that Fermat scribbled his famous
marginal notes. Bachet was also the first man to state (without proof) what is now
known as Lagrange’s four square theorem: every natural number is the sum of at
most four perfect squares.


Suggested Reading 

[1] F Schuh. The Master Book of Mathematical Recreations. Dover. New York, 
pp 115-118, 1968. 

[2] G H Hardy and E M Wright. An Introduction to the Theory of Numbers. Oxford 
Univ. Press. London, pp 115-117, 1971. 
Mathematical Induction

An Impresario of the Infinite 

B Sury 


In the natural sciences, if a certain phenomenon is observed to occur a number of
times, often a general law is formulated. This process is called empirical induction.
In general, any reasoning that draws a general conclusion based on verification of
particular cases is known as induction. But, in mathematics, a statement involving
a natural number n might turn out to be erroneous even if it happens to be true for
the first ten, or thousand, or even million natural numbers. For instance, the numbers
$2^{2^0} + 1 = 3$, $2^{2^1} + 1 = 5$, $2^{2^2} + 1 = 17$, $2^{2^3} + 1 = 257$, $2^{2^4} + 1 = 65537$ are all
prime numbers and the 17th century mathematician Pierre de Fermat suggested that
$2^{2^n} + 1$ must be prime for every positive integer n. However, a century later, another
great mathematician, Leonhard Euler, showed that $2^{2^5} + 1 = 641 \times 6700417$. An even
more convincing example is the following. If we evaluate the expression $991n^2 + 1$ for
small values of n, the resulting number is not the square of a whole number. But, for
n = 12055735790331359447442538767, the value is a perfect square. Indeed, this is
the smallest value of n for which it is a square! This tells us that, in mathematics, a lot
of care is needed to establish an induction procedure which proves a mathematical
theorem for each of an infinite sequence of cases, without exception. The method of
mathematical induction is such a procedure. Let us start with a simple example.
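Both cautionary examples can be reproduced with exact integer arithmetic. The following is a small sketch with illustrative helper names; the bound of 100000 on the "small values of n" is an arbitrary choice and of course proves nothing about larger n.

```python
from math import isqrt

def fermat(n):
    """The n-th Fermat number 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

def is_prime(n):
    """Trial division; adequate for the small Fermat numbers tested here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_square(n):
    """True if n is a perfect square (exact for arbitrarily large integers)."""
    r = isqrt(n)
    return r * r == n

first_five_prime = all(is_prime(fermat(n)) for n in range(5))
euler_factorization = fermat(5) == 641 * 6700417
no_small_square = not any(is_square(991 * n * n + 1) for n in range(1, 100001))
n0 = 12055735790331359447442538767
square_at_n0 = is_square(991 * n0 * n0 + 1)

print(first_five_prime, euler_factorization, no_small_square, square_at_n0)
```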

Suppose we want to prove the statement that $2^n > n$ for every natural number
n. Clearly, this inequality holds for n = 1. Now, to prove the inequality for all
natural numbers, we consider an arbitrary natural number $k \geq 1$. We assume that the
inequality $2^k > k$ holds. Then, for the next natural number k + 1, $2^{k+1} = 2 \times 2^k > 2k$
by our assumption that $2^k > k$. Now, $2k = k + k \geq k + 1$, so that the inequality
$2^{k+1} > k + 1$ follows. Thus, we have proved that if the inequality is true for any
particular k, then it is also true for k + 1.

The crux of the above argument rests on the points: 

(0) Given an infinite sequence of statements $P_r, P_{r+1}, \ldots$, which we would like to
prove, there is a ‘next’ to any statement, and each particular statement can
be reached in a finite number of steps starting from the ‘first’ statement $P_r$.

(1) There is a general method of proving that for any $n \geq r$, if $P_n$ is true, then
$P_{n+1}$ is true; and

(2) The first statement $P_r$ is known to be true.

It is believed that these rules of logic are as fundamental to mathematics as the 
classical rules of Aristotelian logic. 

It is necessary to verify both steps (1) and (2) to avoid landing in absurdities.
For example, if step (2) that ‘starts induction’ is not verified, one can ‘prove’ that
all natural numbers are equal as follows. Simply denote by $P_n$ the statement
‘n = n + 1’. Then, obviously, if $P_n$ is assumed to be true, then n = n + 1 and so
n + 1 = n + 2, which means that $P_{n+1}$ is also true.

Everybody has seen instances of mathematical induction being applied. The
summing of arithmetic and geometric progressions is usually done by this method.

An important point is in order here. Mathematical induction can be used to prove 
a statement that is given to begin with. As for coming up with that statement itself 
(as a guess, say), it is altogether a different matter. Therein lies the creative element 
which cannot be pinned down by any general rules. 

As we observed earlier, mathematical induction is a procedure that involves such 
extremely ‘believable’ logic that we accept it as valid reasoning. But, interestingly, 
we can actually prove its validity if we assume another believable principle which 
is that any non-empty set of positive integers has a least number. That this principle 
gives a proof of the validity of mathematical induction is left as an exercise to the 
reader. 

We now proceed to give various instances where the method of mathematical 
induction appears and proves fruitful. 

The following is a slight variant of the form in which induction is used: 

To prove an infinite sequence $P_k, P_{k+1}, \ldots$ of assertions, one verifies the two steps:

(i) $P_k$ is true.

(ii) For any $n \geq k$, if we assume that all the assertions $P_k, P_{k+1}, \ldots, P_n$ hold good,
then $P_{n+1}$ also holds true.

Induction in Geometry 

As an example, let us show that the sum of the interior angles of a (not necessarily
convex) polygon of n sides is 180(n − 2) degrees for all $n \geq 3$. Call this statement
$P_n$. $P_3$ is true as the sum for a triangle is 180 degrees. $P_4$ is also true since any
quadrilateral can be split into two triangles.

Now, let n > 4 and assume that $P_k$ is true for k = 3, 4, ..., n − 1. Let
$A_1, A_2, \ldots, A_n$ be the vertices of a polygon with n sides. We first notice that there is
always a diagonal (i.e., a segment $A_iA_j$ that is not a side) that splits the polygon into
two with smaller numbers of sides. To see this, consider three neighbouring vertices
A, B, C. Consider all the rays emanating from B and filling the interior angle ABC.
We terminate any ray when it first meets a side or a vertex of the polygon. Either all
these rays intersect only one side (Figure 4.1) or they intersect more than one side
(Figure 4.2). In the first case, AC is a diagonal that splits the original polygon into a
triangle and a polygon with n − 1 sides. In the second case at least one ray terminates
on a vertex other than A or C. Call such a vertex D. Then, BD is a diagonal splitting
the polygon into two, with smaller numbers of sides.

[Figure 4.1 and Figure 4.2]

Therefore, in general, let $A_1A_k$ denote a diagonal which splits the polygon
$A_1A_2 \ldots A_n$ into the polygons $A_1A_2 \ldots A_k$ and $A_kA_{k+1} \ldots A_nA_1$ of $k$ and $n - k + 2$
sides respectively. By the induction hypothesis, $P_k$ and $P_{n-k+2}$ are true, i.e., the sum of
the interior angles of the original polygon $A_1A_2 \ldots A_n$ is $180(k - 2) + 180(n - k) =
180(n - 2)$ degrees. So, $P_n$ is true, which proves by induction that $P_r$ is true for every
$r \ge 3$.

After this standard example, we look at an example where it may not be quite 
apparent that induction can be used. 


The Marriage Problem 

The classical ‘marriage problem’ can be stated as follows. Suppose that each of a 
set of girls is acquainted with a subset from a given set of boys. Is it possible for 
each girl to marry one of her acquaintances? Obviously, a necessary condition is that 
every set of m girls be collectively acquainted with at least m boys. That this suffices 
is the assertion. Here is a proof by induction. 

Let $n$ denote the number of girls. If $n = 1$, the assertion is trivial. If $n > 1$ and
if it is true that every set of $m$ girls, $1 \le m < n$, has at least $m + 1$ acquaintances,
then an arbitrary girl is allowed her choice and the rest are referred to the induction
hypothesis. If, on the other hand, some group of $m$ girls, $1 \le m < n$, has precisely
$m$ collective acquaintances, then this set of $m$ girls is married off by induction and
it is indeed true that the remaining $n - m$ girls satisfy the necessary condition with
respect to the remaining boys. If this were not so, then some set of $s$ spinsters with
$1 \le s \le n - m$ would know fewer than $s$ bachelors, and this set of $s$ spinsters together
with the $m$ just-married girls would have known fewer than $s + m$ boys.
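The statement just proved can be checked by brute force on small instances. The following is a minimal sketch, not part of the original argument: the function names `halls_condition` and `find_marriage` are illustrative choices, and the search simply tries every assignment, so it is only practical for a handful of girls and boys.

```python
from itertools import combinations, permutations

def halls_condition(acquaintances):
    """acquaintances[g] is the set of boys known to girl g.

    Returns True when every set of m girls collectively knows at least m boys.
    """
    girls = range(len(acquaintances))
    for m in range(1, len(acquaintances) + 1):
        for group in combinations(girls, m):
            known = set().union(*(acquaintances[g] for g in group))
            if len(known) < m:
                return False
    return True

def find_marriage(acquaintances):
    """Return a list whose entry g is the boy assigned to girl g, or None.

    Brute force: try every way of assigning distinct boys to the girls.
    """
    boys = sorted(set().union(*acquaintances))
    n = len(acquaintances)
    for choice in permutations(boys, n):
        if all(choice[g] in acquaintances[g] for g in range(n)):
            return list(choice)
    return None

if __name__ == "__main__":
    good = [{"a", "b"}, {"a"}, {"b", "c"}]
    bad = [{"a"}, {"a"}, {"a", "b"}]   # two girls who both know only boy "a"
    for instance in (good, bad):
        print(halls_condition(instance), find_marriage(instance))
```

On every instance the two functions should agree: a marriage exists exactly when the necessary condition holds, which is the content of the assertion.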

The reader is invited to apply induction to solve the following two problems. 

Exercise. Consecutive Number Problem: Agatha and Beula are ‘given’ two consecutive
natural numbers $n$ and $n + 1$. Both know that the numbers are consecutive
but neither knows whose number is bigger. After every minute a beep is heard and
each is asked to simultaneously say out aloud whether she knows the other’s number.
Prove by induction on the smaller number $n$ that the person who has the number $n$
guesses correctly after precisely the $n$th beep.

Exercise. Macaulay Expansion: Given a natural number $d \ge 2$, let us write down
the $d$-tuples of non-negative integers in strictly decreasing order. Order the tuples
lexicographically. Prove that the number of tuples appearing prior to a particular tuple
$(k_d, k_{d-1}, \ldots, k_1)$ is precisely

$$\binom{k_d}{d} + \binom{k_{d-1}}{d-1} + \cdots + \binom{k_1}{1}.$$

This proves that any $n$ has a unique expansion

$$n = \binom{k_d}{d} + \binom{k_{d-1}}{d-1} + \cdots + \binom{k_1}{1},$$

where $k_d > k_{d-1} > \cdots > k_1$. Here $\binom{n}{r}$ denotes the binomial coefficient, which is 0
when $n < r$.


Induction Incognito—Use of a ‘Dummy’ Element 

Look at the following statement: 

If $a_1 < a_2 < \cdots < a_{n+1}$ are integers from the set $\{1, 2, \ldots, 2n\}$, then $a_i$ divides $a_j$
for some $i < j$.

This can be proved by the ‘pigeon-hole principle’ as follows. Write $a_i = 2^{e_i} l_i$ with $l_i$
odd. Then, $l_1, \ldots, l_{n+1}$, being $n + 1$ odd numbers between 1 and $2n$, cannot all be different.
If $l_i = l_j = l$ with $i < j$, then, clearly, $a_i = 2^{e_i} l$ divides $a_j = 2^{e_j} l$.

In terms of economy and elegance, this is unbeatable. However, we find to our 
surprise that even induction works and, in fact, proves the following more general 
statement: 

Let $r \ge 1$, and let $A \subset \{1, 2, \ldots, 2^r n\}$ be a subset of cardinality $(2^r - 1)n + 1$. Then,
there exists a chain of $r + 1$ elements of $A$ with each dividing the next.

Let us prove the original statement (corresponding to $r = 1$). Note that it is clearly
true for $n = 1$. Assume it is true for $n$. Consider now $n + 2$ numbers $a_1 < \cdots < a_{n+2}$
among 1 to $2n + 2$. If $a_{n+1} \le 2n$, we are done by the induction hypothesis. In the
contrary case, we must have $a_{n+1} = 2n + 1$ and $a_{n+2} = 2n + 2$. If one of the $a_i$’s
is $n + 1$, we are done as it divides $a_{n+2}$. So, suppose $a_i \ne n + 1$ for any $i$. We may
also assume that none of the $n$ numbers $a_1, \ldots, a_n$ divides another or else we have
nothing to prove. Now, we put in this new number $n + 1$ (as a ‘dummy element’) to
get $n + 1$ numbers between 1 and $2n$. By the induction hypothesis, one of these $n + 1$
numbers divides another. Since this has happened only after the advent of the new
number $n + 1$, it must be that either: (i) some $a_i$ ($i \le n$) divides $n + 1$, or (ii) $n + 1$
divides some $a_i$ ($i \le n$). But, clearly, (ii) cannot happen as $n + 1 \ne a_i \le 2n$. Thus,
some $a_i$ ($i \le n$) divides $n + 1$ and, therefore, divides $2n + 2 = a_{n+2}$ also. Thus, we
used $n + 1$ as a ‘dummy element’ in this proof.

The reader is urged to complete the proof of the general statement along the same 
lines. 
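For small $n$ the original statement ($r = 1$) can also be confirmed by exhaustive search. The sketch below is only an illustration; the function names are our own, and the search is feasible only for small $n$ because the number of subsets grows rapidly.

```python
from itertools import combinations

def has_dividing_pair(numbers):
    """Return True if some element of `numbers` divides a different, larger element."""
    ordered = sorted(numbers)
    return any(b % a == 0 for a, b in combinations(ordered, 2))

def statement_holds(n):
    """Check every (n+1)-element subset of {1, ..., 2n} for a dividing pair."""
    return all(has_dividing_pair(subset)
               for subset in combinations(range(1, 2 * n + 1), n + 1))

def upper_half(n):
    """The n numbers n+1, ..., 2n: none of them divides another."""
    return list(range(n + 1, 2 * n + 1))

if __name__ == "__main__":
    for n in range(1, 9):
        print(n, statement_holds(n), has_dividing_pair(upper_half(n)))
```

The second helper shows that $n + 1$ cannot be lowered to $n$: the numbers $n + 1, \ldots, 2n$ form a set of size $n$ with no element dividing another.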

Now, we come to a final example where induction appears in a different guise.

Backward Induction


If a statement is easily proved for a particular infinite subsequence of positive integers,
it might be worthwhile to try and see whether ‘backward induction’ works. By
this, we mean the following. Suppose we want to prove statements $P_n$ for all positive
integers $n$. Suppose, further, that it is easy to check the veracity of $P_n$ for all $n$ in an
infinite sequence of natural numbers. Then, if we check that for any $m \ge 2$ the truth
of $P_m$ implies the truth of $P_{m-1}$, the statements $P_n$ follow for all positive integers $n$.

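An instance is the familiar arithmetic mean-geometric mean inequality

$$\left(\frac{a_1 + a_2 + \cdots + a_n}{n}\right)^n \ge a_1 a_2 \cdots a_n$$

for arbitrary non-negative real numbers $a_i$, where equality holds if, and only if, all the
numbers are equal. Call this statement $P_n$.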

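On the one hand, we prove this for $n = 2^k$ by induction on $k$. Let $k = 1$. Then,
$(a_1 + a_2)^2 \ge 4a_1a_2$ with equality exactly when $a_1 = a_2$, since the difference
$(a_1 + a_2)^2 - 4a_1a_2 = (a_1 - a_2)^2$. Assume that $P_n$ is true for $n = 2^r$, $r \le k$. Let
$a_i$, $i \le 2^{k+1}$, be non-negative real numbers. Then, $\sum_{i \le 2^{k+1}} a_i = \sum_{i \le 2^k} b_i$ where
$b_i = a_{2i-1} + a_{2i}$. Therefore,

$$\Big(\sum_{i \le 2^{k+1}} a_i\Big)^{2^{k+1}} = \Big(\sum_{i \le 2^k} b_i\Big)^{2^{k+1}} \ge 2^{k 2^{k+1}} \prod_{i \le 2^k} b_i^2 \ge 2^{k 2^{k+1}} \prod_{i \le 2^k} 4 a_{2i-1} a_{2i} = 2^{(k+1) 2^{k+1}} \prod_{i \le 2^{k+1}} a_i,$$

which proves that $P_{2^{k+1}}$ is true. Hence, by induction, $P_{2^r}$ is valid for all $r \ge 1$.
Moreover, note that the above proof also shows that the equality
$\big(\sum_{i \le 2^{k+1}} a_i\big)^{2^{k+1}} = 2^{(k+1) 2^{k+1}} \prod_{i \le 2^{k+1}} a_i$ implies that all inequalities occurring on the way are equalities,
which again proves by induction that equality can hold in $P_{2^r}$ if, and only if, all the
$a_i$’s are equal.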

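On the other hand, for any $m$, the validity of $P_m$ implies the validity of $P_{m-1}$ as
follows:

Let $a_1, \ldots, a_{m-1}$ be given. Consider $a_m = \frac{1}{m-1} \sum_{i \le m-1} a_i$. Then,

$$\Big(\frac{m}{m-1}\Big)^m \Big(\sum_{i \le m-1} a_i\Big)^m = \Big(\sum_{i \le m} a_i\Big)^m \ge m^m \prod_{i \le m} a_i = \frac{m^m}{m-1} \Big(\prod_{i \le m-1} a_i\Big) \Big(\sum_{i \le m-1} a_i\Big).$$

Cancelling the common factors (the case where all the $a_i$ vanish being trivial) gives
$\big(\sum_{i \le m-1} a_i\big)^{m-1} \ge (m-1)^{m-1} \prod_{i \le m-1} a_i$, which is $P_{m-1}$.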

Once again, by induction, equality implies that all the numbers are equal.

To end our discussion, the reader is invited to apply induction on the positive
integer $p$ below to prove the following result which solves an interesting two-player
game called Euclid.

Let $(p, q)$ be a pair of positive integers satisfying $p > q$. Each player subtracts a
multiple of the smaller number from the bigger one without making the result negative.
The winner is the one first hitting the highest common factor of $p$ and $q$. Then,
there is a winning strategy for the first player if and only if $q < \frac{1}{2}(\sqrt{5} - 1)p$.


Suggested Reading 

[1] R Courant and H Robbins. What is Mathematics? Oxford University Press, 
1941. 

[2] L I Golovina and I M Yaglom. Induction in Geometry. Little Mathematics 
Library. Mir Publishers. Moscow, 1979.


On the Infinitude of the Prime Numbers

Euler’s Proof 

Shailesh A Shirali 

Euclid’s elegant proof that there are infinitely many prime numbers is well known. 
Euler proved the same result, in fact a stronger one, by analytical methods. This 
article gives an exposition of Euler’s proof introducing the necessary concepts 
along the way. 


Introduction 

In this article, we present Euler’s very beautiful proof that there are infinitely many 
prime numbers. In an earlier era, Euclid had proved this result in a simple yet elegant 
manner. His idea is easy to describe. Denoting the prime numbers by $p_1, p_2, p_3, \ldots$,
such that $p_1 = 2, p_2 = 3, p_3 = 5, \ldots$, he supposes that there are $n$ primes in all, the
largest being $p_n$. He then considers the number $N$ where

$$N = p_1 p_2 p_3 \cdots p_n + 1$$

and asks what the prime factors of $N$ could be. It is clear that $N$ is indivisible by
each of the primes $p_1, p_2, p_3, \ldots, p_n$ (indeed, $N \equiv 1 \pmod{p_i}$ for each $i$, $1 \le i \le n$).
Since every integer greater than 1 has a prime factorization, this forces into existence
prime numbers other than the $p_i$. Thus there can be no largest prime number, and so
the number of primes is infinite.
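Euclid's construction is easy to try out. The sketch below is an illustration only (the helper names are ours): it forms $N$ from the first six primes and finds its smallest prime factor by trial division. Note that $N$ itself need not be prime; the argument only requires that its prime factors lie outside the list.

```python
def smallest_prime_factor(m):
    """Return the smallest prime factor of an integer m > 1 by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

def euclid_number(primes):
    """Return p_1 * p_2 * ... * p_n + 1 for the given list of primes."""
    product = 1
    for p in primes:
        product *= p
    return product + 1

if __name__ == "__main__":
    primes = [2, 3, 5, 7, 11, 13]
    N = euclid_number(primes)
    q = smallest_prime_factor(N)
    # N = 30031 = 59 * 509, so N is composite, yet 59 is a prime not in the list.
    print(N, q, q in primes)
```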

The underlying idea of Euler’s proof is very different from that of Euclid’s proof. 
In essence, he proves that the sum of the reciprocals of the primes is infinite;¹ that is,

$$\frac{1}{p_1} + \frac{1}{p_2} + \frac{1}{p_3} + \cdots = \infty.$$

In technical language, the series $\sum_i 1/p_i$ diverges. Obviously, this cannot possibly
happen if there are only finitely many prime numbers. The infinitude of the primes
thus follows as a corollary. Note that Euler’s result is stronger than Euclid’s.

Convergence and Divergence

A few words are necessary to explain the concepts of convergence and divergence
of infinite series. A series $a_1 + a_2 + a_3 + \cdots$ is said to converge if the sequence of
partial sums,

$$a_1, \quad a_1 + a_2, \quad a_1 + a_2 + a_3, \quad \ldots,$$

approaches some limiting value, say $L$; we write, in this case, $\sum_i a_i = L$. If,
instead, the sequence of partial sums grows without any bound, we say that the series
diverges, and we write, in short,¹ $\sum_i a_i = \infty$.

Examples. 

• The series $1/1 + 1/2 + 1/4 + \cdots + 1/2^n + \cdots$ converges (the sum is 2, as is
easily shown).

• The series $1/1 + 1/3 + 1/9 + \cdots + 1/3^n + \cdots$ converges (the sum in this case
is 3/2).

• The series $1 + 1 + 1 + \cdots$ diverges (rather trivially).

• The series $1 - 1 + 1 - 1 + 1 - 1 + 1 - \cdots$ also fails to converge, because the
partial sums assume the values 1, 0, 1, 0, 1, 0, ... and this sequence clearly
does not possess a limit.

• A more interesting example: $1 - 1/2 + 1/3 - 1/4 + \cdots$; a careful analysis
shows that it too is convergent, the limiting sum being $\ln 2$ (the natural logarithm
of 2).
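The behaviour of these examples can be observed numerically by listing partial sums. The following sketch is an illustration under our own naming; floating-point partial sums only suggest, and do not prove, convergence.

```python
import math
from itertools import accumulate

def partial_sums(term, how_many):
    """Return the first `how_many` partial sums of term(1) + term(2) + ..."""
    return list(accumulate(term(i) for i in range(1, how_many + 1)))

def geometric(i):
    """Terms 1/1, 1/2, 1/4, ..."""
    return 1.0 / 2 ** (i - 1)

def alternating_ones(i):
    """Terms 1, -1, 1, -1, ..."""
    return 1 if i % 2 == 1 else -1

def alternating_harmonic(i):
    """Terms 1, -1/2, 1/3, -1/4, ..."""
    return alternating_ones(i) / i

if __name__ == "__main__":
    print(partial_sums(geometric, 10)[-1])              # close to 2
    print(partial_sums(alternating_ones, 6))            # oscillates between 1 and 0
    print(partial_sums(alternating_harmonic, 100000)[-1], math.log(2))
```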


Divergence of the Harmonic Series $\sum 1/i$

In order to prove Euler’s result, namely, the divergence of $\sum 1/p_i$, we need to establish
various subsidiary results. Along the way, we shall meet other examples of divergent
series. To start with, we present the proof of the statement that

$$\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = \infty.$$

This rather non-obvious result is usually referred to as the divergence of the harmonic
series. The proof given below is due to the Frenchman Nicole Oresme and it
dates to about 1350. We note the following sequence of equalities and inequalities:

$$
\begin{aligned}
\frac{1}{1} &= \frac{1}{1},\\
\frac{1}{2} &= \frac{1}{2},\\
\frac{1}{3} + \frac{1}{4} &> \frac{1}{4} + \frac{1}{4} = \frac{1}{2},\\
\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} &> \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{1}{2},\\
\frac{1}{9} + \frac{1}{10} + \cdots + \frac{1}{16} &> \frac{1}{16} + \frac{1}{16} + \cdots + \frac{1}{16} = \frac{1}{2},
\end{aligned}
$$

and so on. We see that it is possible to group consecutive sets of terms of the series
$1/1 + 1/2 + 1/3 + \cdots$ in such a manner that each group has a sum of at least 1/2.
Since the number of such groups is infinite, it follows that the sum of the whole series
is itself infinite. (Note the crisp and decisive nature of the proof!)

¹ A statement of the form $\sum a_i = \infty$ is to be regarded as merely a short form for the statement that the
sums $a_1, a_1 + a_2, a_1 + a_2 + a_3, \ldots$ do not possess any limit. It is important to note that $\infty$ is not to be
regarded as a number! We shall however frequently use phrases of the type ‘$x = \infty$’ (for various quantities
$x$) during the course of this article. The meaning should be clear from the context.

Based on this proof, we make a more precise statement. Let $S(n)$ denote the sum

$$S(n) = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n},$$

e.g., $S(3) = 11/6$. Generalizing from the reasoning just used, we find that

$$S(2^n) \ge 1 + \frac{n}{2}. \qquad (1)$$

(Please fill in the details of the proof on your own.) This means that by choosing $n$
to be large enough, the value of $S(2^n)$ can be made to exceed any given bound. For
instance, if we wanted the sum to exceed 100, then (1) assures us that a mere $2^{200}$
terms would suffice! This suggests the extreme slowness of growth of $S(n)$ with $n$.
Nevertheless it does grow without bound; loosely stated, $S(\infty) = \infty$.

The result obtained above, (1), can also be written in the form

$$S(n) \ge 1 + \frac{1}{2} \log_2 n.$$


Exercise. Write out a proof of the above inequality. 
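Before writing out a proof, it may help to test both inequalities numerically. The sketch below uses exact rational arithmetic so that the cases of equality ($n = 1$ and $n = 2$) are handled without rounding trouble; the function name `harmonic` is our own.

```python
import math
from fractions import Fraction

def harmonic(n):
    """Return S(n) = 1/1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

if __name__ == "__main__":
    print(harmonic(3))                       # 11/6
    for k in range(0, 11):
        s = harmonic(2 ** k)
        # Inequality (1): S(2^k) >= 1 + k/2.
        print(k, float(s), s >= 1 + Fraction(k, 2))
    for n in (3, 10, 100):
        # Second form: S(n) >= 1 + (1/2) log2(n).
        print(n, float(harmonic(n)), 1 + 0.5 * math.log2(n))
```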

A much more accurate statement can be made, but it involves calculus. We consider
the curve $\Omega$ whose equation is $y = 1/x$, $x > 0$. The area of the region enclosed by
$\Omega$, the x-axis and the ordinates $x = 1$ and $x = n$ is equal to $\int_1^n \frac{1}{x}\,dx$, which simplifies
to $\ln n$. Now let the region be divided into $(n - 1)$ strips of unit width by the lines
$x = 1, x = 2, x = 3, \ldots, x = n$ (see Figure 5.1).

Figure 5.1 The figure shows how to bound $\ln n$ by observing that $\ln n$ is the area
enclosed by the curve $y = 1/x$, the x-axis and the ordinates $x = 1$ and $x = n$.

Consider the region enclosed by $\Omega$, the x-axis, and the lines $x = i - 1$, $x = i$.
The area of this region lies between $1/i$ and $1/(i - 1)$, because it can be enclosed
between two rectangles of dimensions $1 \times 1/i$ and $1 \times 1/(i - 1)$, respectively. (A quick
examination of the graph will show why this is true.) By letting $i$ take the values
$2, 3, 4, \ldots, n$, and adding the inequalities thus obtained, we find that

$$\frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < \ln n < \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n-1}. \qquad (2)$$

Relation (2) implies that

$$\ln n + \frac{1}{n} < \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < \ln n + 1, \qquad (3)$$

and this means that we have an estimate for $S(n)$ (namely, $\ln n + 0.5$) that differs
from the actual value by no more than 0.5. A still deeper analysis shows that for
large values of $n$, an excellent approximation for $S(n)$ is $\ln n + 0.577$, but we shall
not prove this result here. It is instructive, however, to check the accuracy of this
estimate. Write $f(n)$ for $\ln n + 0.577$. We now find the following:


n      =   10        100       1000      10000     100000
S(n)   =   2.92897   5.18738   7.48547   9.78761   12.0902
f(n)   =   2.87959   5.18217   7.48476   9.78734   12.0899


The closeness of the values of f(n) and S(n) for large values of n is striking. (The 
constant 0.577 is related to what is known as the Euler-Mascheroni constant.) 
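The entries of the table can be recomputed directly. This is a small sketch, assuming ordinary double-precision arithmetic is accurate enough for five decimal places; `harmonic` and `estimate` are our own names for $S(n)$ and $f(n)$.

```python
import math

def harmonic(n):
    """S(n) = 1/1 + 1/2 + ... + 1/n, summed with math.fsum to limit rounding error."""
    return math.fsum(1.0 / i for i in range(1, n + 1))

def estimate(n):
    """f(n) = ln n + 0.577, the approximation discussed in the text."""
    return math.log(n) + 0.577

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000, 100000):
        print(n, round(harmonic(n), 5), round(estimate(n), 5))
```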

In general, when mathematicians find that a series $\sum a_i$ diverges, they are also
curious to know how fast it diverges. That is, they wish to find a function, say $f(n)$,
such that the ratio $\big(\sum_{i \le n} a_i\big) / f(n)$ tends to 1 as $n \to \infty$. For the harmonic series $\sum 1/i$
we see that one such function is given by $f(n) = \ln n$. This is usually expressed by
saying that the harmonic series diverges like the logarithmic function. We note in
passing that this is a very slow rate of divergence, because $\ln n$ diverges more slowly
than $n^{\epsilon}$ for any $\epsilon > 0$, no matter how small $\epsilon$ is, in the sense that $\ln n / n^{\epsilon} \to 0$ as
$n \to \infty$ for every $\epsilon > 0$. Obviously the function $\ln \ln n$ diverges still more slowly.

Exercise. Prove that if $a > 1$, then the series

$$\frac{1}{1^a} + \frac{1}{2^a} + \frac{1}{3^a} + \cdots$$

converges. (The conclusion holds no matter how close $a$ is to 1, but it does not hold
for $a = 1$ or $a < 1$, a curious state of affairs!) Further, use the methods of integral
calculus (and the fact that for $a \ne 1$ the integral of $1/x^a$ is $x^{1-a}/(1 - a)$) to show
that the sum of the series lies between $1/(a - 1)$ and $a/(a - 1)$.

The fact that the sum $1/1^2 + 1/2^2 + 1/3^2 + \cdots$ is finite can be shown in another
manner that is both elegant and elementary. We start with the inequalities $2^2 >
1 \times 2$, $3^2 > 2 \times 3$, $4^2 > 3 \times 4$, ..., and deduce from these that

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots < 1 + \frac{1}{1 \times 2} + \frac{1}{2 \times 3} + \frac{1}{3 \times 4} + \cdots.$$

The sum on the right side can be written in the form

$$1 + \Big(\frac{1}{1} - \frac{1}{2}\Big) + \Big(\frac{1}{2} - \frac{1}{3}\Big) + \Big(\frac{1}{3} - \frac{1}{4}\Big) + \cdots, \qquad (4)$$

which (after a whole feast of cancellations) simplifies to $1 + 1/1$, that is, to 2. (This is
sometimes described by stating that the series ‘telescopes’ to 2.) Therefore the sum
$1 + 1/2^2 + 1/3^2 + 1/4^2 + \cdots$ is less than 2. We now call upon a theorem of analysis
which states that if the partial sums of any series form an increasing sequence and
are at the same time bounded, that is, they do not exceed some fixed number, then
they possess a limit. We conclude, therefore, that the series $\sum 1/i^2$ does possess a
finite sum which lies between 1 and 2.
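The telescoping bound can be verified exactly for any finite number of terms. In the sketch below (names are our own) the partial sum of the comparison series comes out as $2 - 1/n$, and the partial sums of $\sum 1/k^2$ stay strictly below it for $n \ge 2$.

```python
from fractions import Fraction

def telescoping_bound(n):
    """1 + 1/(1*2) + 1/(2*3) + ... + 1/((n-1)*n), computed exactly."""
    return 1 + sum(Fraction(1, k * (k + 1)) for k in range(1, n))

def sum_of_inverse_squares(n):
    """1/1^2 + 1/2^2 + ... + 1/n^2, computed exactly."""
    return sum(Fraction(1, k * k) for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (2, 5, 10, 100):
        print(n, float(sum_of_inverse_squares(n)), telescoping_bound(n))
```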

The divergence of the harmonic series was independently proved by Johann Bernoulli
in 1689 in a completely different manner. His proof is worthy of deep study, as
it shows the counter-intuitive nature of infinity. 

Bernoulli starts by assuming that the series $1/2 + 1/3 + 1/4 + \cdots$ (note that he
starts with 1/2 rather than 1/1) does have a finite sum, which he calls $S$. He now proceeds
to derive a contradiction in the following manner. He rewrites each term occurring in
$S$ thus:

$$\frac{1}{3} = \frac{2}{6} = \frac{1}{6} + \frac{1}{6}, \qquad \frac{1}{4} = \frac{3}{12} = \frac{1}{12} + \frac{1}{12} + \frac{1}{12},$$

and more generally,

$$\frac{1}{n} = \frac{n-1}{n(n-1)} = \frac{1}{n(n-1)} + \frac{1}{n(n-1)} + \cdots + \frac{1}{n(n-1)},$$

with $(n - 1)$ fractions on the right side. Next he writes the resulting fractions in an
array as shown below:


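$$
\begin{array}{cccccccc}
1/2 & 1/6 & 1/12 & 1/20 & 1/30 & 1/42 & 1/56 & \cdots \\
    & 1/6 & 1/12 & 1/20 & 1/30 & 1/42 & 1/56 & \cdots \\
    &     & 1/12 & 1/20 & 1/30 & 1/42 & 1/56 & \cdots \\
    &     &      & 1/20 & 1/30 & 1/42 & 1/56 & \cdots \\
    &     &      &      & 1/30 & 1/42 & 1/56 & \cdots \\
    &     &      &      &      & 1/42 & 1/56 & \cdots \\
    &     &      &      &      &      & 1/56 & \cdots
\end{array}
$$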


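Note that the column sums are just the fractions 1/2, 1/3, 1/4, 1/5, ...; thus, $S$ is the
sum of all the fractions occurring in the array. Bernoulli now sums the rows using
the telescoping technique used above (see equation (4)). Assigning symbols to the
row sums as shown below,

$$
\begin{aligned}
A &= \frac{1}{2} + \frac{1}{6} + \frac{1}{12} + \frac{1}{20} + \frac{1}{30} + \frac{1}{42} + \frac{1}{56} + \cdots,\\
B &= \frac{1}{6} + \frac{1}{12} + \frac{1}{20} + \frac{1}{30} + \frac{1}{42} + \frac{1}{56} + \cdots,\\
C &= \frac{1}{12} + \frac{1}{20} + \frac{1}{30} + \frac{1}{42} + \frac{1}{56} + \cdots,\\
D &= \frac{1}{20} + \frac{1}{30} + \frac{1}{42} + \frac{1}{56} + \cdots,
\end{aligned}
$$

he finds that:

$$
\begin{aligned}
A &= \Big(\frac{1}{1} - \frac{1}{2}\Big) + \Big(\frac{1}{2} - \frac{1}{3}\Big) + \Big(\frac{1}{3} - \frac{1}{4}\Big) + \cdots = 1,\\
B &= A - \frac{1}{2} = \frac{1}{2},\\
C &= B - \frac{1}{6} = \frac{1}{3} \quad \text{(arguing likewise)},\\
D &= C - \frac{1}{12} = \frac{1}{4},
\end{aligned}
$$

and so on. Thus the sum $S$, which we had written in the form $A + B + C + D + \cdots$,
turns out to be equal to

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots.$$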


Now this looks disappointing—just as things were beginning to look promising! 
We seem to have just recovered the original series after a series of very complicated 
steps. But in fact something significant has happened: an extra ‘1’ has entered the 
series. At the start we had defined $S$ to be $1/2 + 1/3 + 1/4 + \cdots$; now we find that
$S$ equals $1 + 1/2 + 1/3 + 1/4 + \cdots$. This means that $S = S + 1$. However, no finite
number can satisfy such an equation. Conclusion: $S = \infty$!

There are many other proofs of this beautiful result, but I shall leave you with the 
pleasant task of coming up with them on your own. Along the way you could set 
yourself the task of proving that each of the following sums diverges:


• $1/1 + 1/3 + 1/5 + 1/7 + 1/9 + \cdots$;

• $1/1 + 1/11 + 1/21 + 1/31 + 1/41 + \cdots$;

• $1/a + 1/b + 1/c + 1/d + \cdots$, where $a, b, c, d, \ldots$ are the successive terms of
any increasing arithmetic progression of positive real numbers.

Elementary Results

The next result that we shall need is the so-called fundamental theorem of arithmetic: 
every positive integer greater than 1 can be expressed in precisely one way as a 
product of prime numbers. We shall not prove this very basic theorem of number 
theory. For a proof, please refer to any of the well-known texts on number theory, 
e.g., the text by Hardy and Wright, or the one by Niven and Zuckerman.

We shall also need the following rather elementary results: (i) if $k$ is any integer
greater than 1, then

$$\frac{1}{1 - 1/k} = 1 + \frac{1}{k} + \frac{1}{k^2} + \frac{1}{k^3} + \cdots, \qquad (5)$$

which follows by summing the geometric series on the right side, and (ii) if $a_i$, $b_j$
are any quantities, then

$$\Big(\sum_i a_i\Big) \Big(\sum_j b_j\Big) = \sum_{i,j} a_i b_j,$$

where, in the sum on the right, each pair of indices $(i, j)$ occurs precisely once.

Now consider the following two equalities, which are obtained from (5) using the
values $k = 2$, $k = 3$:

$$\frac{1}{1 - 1/2} = 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \frac{1}{2^4} + \cdots,$$

$$\frac{1}{1 - 1/3} = 1 + \frac{1}{3} + \frac{1}{3^2} + \frac{1}{3^3} + \frac{1}{3^4} + \cdots.$$

We multiply together the corresponding sides of these two equations. On the left side
we obtain $2 \times 3/2 = 3$. On the right side we obtain the product

$$(1 + 1/2 + 1/2^2 + 1/2^3 + \cdots) \times (1 + 1/3 + 1/3^2 + 1/3^3 + \cdots).$$


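Expanding the product, we obtain:

$$
\Big(1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots\Big) + \Big(\frac{1}{3} + \frac{1}{9} + \frac{1}{27} + \cdots\Big) + \Big(\frac{1}{6} + \frac{1}{12} + \frac{1}{24} + \cdots\Big) + \Big(\frac{1}{18} + \frac{1}{36} + \frac{1}{72} + \cdots\Big) + \cdots,
$$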


that is, we obtain the sum of the reciprocals of all the positive integers that have only
2 and 3 among their prime factors. The fundamental theorem of arithmetic assures
us that each such integer occurs precisely once in the sum on the right side. Thus we
obtain a nice corollary: If $A$ denotes the set of integers of the form $2^a 3^b$, where $a$
and $b$ are non-negative integers, then

$$\sum_{z \in A} \frac{1}{z} = 3.$$

If we multiply the left side of this relation by $(1 + 1/5 + 1/5^2 + 1/5^3 + \cdots)$ and
the right side by $1/(1 - 1/5)$, we obtain the following result:

$$\sum_{z \in B} \frac{1}{z} = 3 \times \frac{1}{1 - 1/5} = \frac{15}{4},$$

where $B$ denotes the set of integers of the form $2^a 3^b 5^c$, where $a$, $b$ and $c$ denote
non-negative integers.

Continuing this line of argument, we see that infinitely many such statements can 
be made, for example: 

• If $C$ denotes the set of positive integers of the form $2^a 3^b 5^c 7^d$, where $a$, $b$, $c$
and $d$ are non-negative integers, we then have $\sum_{z \in C} 1/z = (15/4)(7/6) =
35/8$.

• If $D$ denotes the set of positive integers of the form $2^a 3^b 5^c 7^d 11^e$, then
$\sum_{z \in D} 1/z = (35/8)(11/10) = 77/16$.
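These identities can be tested numerically by listing the integers built from a given set of primes up to some cut-off and adding their reciprocals. The sketch below is illustrative: the cut-off `limit` and the function names are our own choices, and the truncated sum only approaches the exact product from below.

```python
from fractions import Fraction

def smooth_numbers(primes, limit):
    """Return all positive integers <= limit whose prime factors lie in `primes`."""
    found = [1]
    for p in primes:
        extended = []
        for m in found:
            # Append m, m*p, m*p^2, ... while the value stays within the limit.
            while m <= limit:
                extended.append(m)
                m *= p
        found = extended
    return sorted(found)

def reciprocal_sum(primes, limit):
    """Sum of 1/z over the integers z <= limit built from `primes`."""
    return sum(1.0 / m for m in smooth_numbers(primes, limit))

def euler_product(primes):
    """Exact value of the product of 1/(1 - 1/p) over the given primes."""
    result = Fraction(1)
    for p in primes:
        result *= Fraction(p, p - 1)
    return result

if __name__ == "__main__":
    for primes in ([2, 3], [2, 3, 5], [2, 3, 5, 7], [2, 3, 5, 7, 11]):
        print(primes, euler_product(primes), reciprocal_sum(primes, 10 ** 6))
```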

Infinitude of the Primes 


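Suppose now that there are only finitely many primes, say $p_1, p_2, p_3, \ldots, p_n$, where
$p_1 = 2, p_2 = 3, p_3 = 5, \ldots$. We consider the product

$$\frac{1}{1 - 1/2} \cdot \frac{1}{1 - 1/3} \cdot \frac{1}{1 - 1/5} \cdots \frac{1}{1 - 1/p_n}.$$

This is obviously a finite number, being the product of finitely many non-zero fractions.
Now this product also equals

$$\Big(1 + \frac{1}{2} + \frac{1}{2^2} + \cdots\Big) \times \Big(1 + \frac{1}{3} + \frac{1}{3^2} + \cdots\Big) \times \Big(1 + \frac{1}{5} + \frac{1}{5^2} + \cdots\Big) \times \cdots \times \Big(1 + \frac{1}{p_n} + \frac{1}{p_n^2} + \cdots\Big).$$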

When we expand this product, we find, by continuing the line of argument developed
above, that we obtain the sum of the reciprocals of all the positive integers. To see
why, we need to use the fundamental theorem of arithmetic and the assumption that
$2, 3, 5, \ldots, p_n$ are all the primes that exist; these two statements together imply that
every positive integer can be expressed uniquely as a product of non-negative powers
of the $n$ primes $2, 3, 5, \ldots, p_n$. From this it follows that the expression on the right
side is precisely the sum

$$\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots,$$

written in some permuted order. But by the Oresme-Bernoulli theorem, the latter
sum is infinite! So we have a contradiction: the finite number

$$\frac{1}{1 - 1/2} \cdot \frac{1}{1 - 1/3} \cdot \frac{1}{1 - 1/5} \cdots \frac{1}{1 - 1/p_n}$$

has been shown to be infinite, an absurdity! The only way out of this contradiction
is to drop the assumption that there are only finitely many prime numbers. Thus we
reach the desired objective, namely, that of proving that there are infinitely many
prime numbers.

Note that, as a bonus, there are several formulas that drop out of this analysis,
more or less as corollaries. For instance, we find that

$$\frac{1}{1 - 1/2^2} \cdot \frac{1}{1 - 1/3^2} \cdot \frac{1}{1 - 1/5^2} \cdots = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots,$$

that is, the infinite product and the infinite sum both converge to the same (finite)
value. By a stunning piece of reasoning, including a few daring leaps that would
leave today’s mathematicians gasping for breath, Euler showed that both sides of the
above equation are equal to $\pi^2/6$. Likewise, we find that

$$\frac{1}{1 - 1/2^4} \cdot \frac{1}{1 - 1/3^4} \cdot \frac{1}{1 - 1/5^4} \cdots = 1 + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \cdots,$$

and this time both sides converge to $\pi^4/90$. Euler proved all this and much much
more; it is not for nothing that he is at times referred to as analysis incarnate!
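Both formulas can be checked numerically by truncating the product and the sum. The sketch below is an illustration only: the truncation points are arbitrary choices of ours, and a sieve of Eratosthenes (not discussed in the text) is used simply to list the primes.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n (for n >= 2)."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def truncated_product(s, bound):
    """Product of 1/(1 - 1/p^s) over the primes p <= bound."""
    product = 1.0
    for p in primes_up_to(bound):
        product *= 1.0 / (1.0 - p ** (-s))
    return product

def truncated_sum(s, terms):
    """1 + 1/2^s + 1/3^s + ... up to the given number of terms."""
    return math.fsum(1.0 / k ** s for k in range(1, terms + 1))

if __name__ == "__main__":
    print(truncated_product(2, 10 ** 5), truncated_sum(2, 10 ** 5), math.pi ** 2 / 6)
    print(truncated_product(4, 10 ** 3), truncated_sum(4, 10 ** 3), math.pi ** 4 / 90)
```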


The Divergence of $\sum 1/p$


As mentioned earlier, Euler showed in addition that the sum

$$\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \cdots$$

is itself infinite. We are now in a position to obtain this beautiful result. For any
positive integer $n \ge 2$, let $P_n$ denote the set of prime numbers less than or equal to $n$.
We start by showing that

$$\prod_{p \in P_n} \frac{1}{1 - 1/p} > \sum_{j=1}^{n} \frac{1}{j}. \qquad (6)$$


Our strategy will be a familiar one. We write down the following inequality for
each $p \in P_n$, which follows from (5):

$$\frac{1}{1 - 1/p} > 1 + \frac{1}{p} + \frac{1}{p^2} + \frac{1}{p^3} + \cdots + \frac{1}{p^n}.$$

The ‘>’ sign holds because we have left out all the positive terms that follow the term
$1/p^n$. Multiplying together the corresponding sides of all these inequalities ($p \in P_n$),
we obtain:

$$\prod_{p \in P_n} \frac{1}{1 - 1/p} > \prod_{p \in P_n} \Big(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^n}\Big).$$

When we expand the product on the right side, we obtain a sum of the form $\sum_{j \in A} 1/j$
for some set of positive integers $A$. This set certainly includes all the integers from 1
to $n$ because the set $P_n$ contains all the prime numbers between 1 and $n$. Inequality
(6) thus follows immediately.

Next, we already know (see equation (3)) that

$$\sum_{j=1}^{n} \frac{1}{j} > \ln n + \frac{1}{n} > \ln n. \qquad (7)$$

Combining (6) and (7), we obtain the following inequality:

$$\prod_{p \in P_n} \frac{1}{1 - 1/p} > \ln n.$$

Taking logarithms on both sides, this translates into the statement

$$\sum_{p \in P_n} \ln \frac{1}{1 - 1/p} > \ln \ln n. \qquad (8)$$


Our task is nearly over. It only remains to relate the sum $\sum 1/p_i$ to
the left side of (8). We accomplish this by showing that the inequality

$$\frac{7x}{5} > \ln \frac{1}{1 - x} \qquad (9)$$

holds for $0 < x \le 1/2$.

To see why (9) is true, draw the graph of the curve $\Gamma$ whose equation is $y =
\ln(1/(1 - x))$, over the domain $-\infty < x < 1$ (see Figure 5.2). Note that $\Gamma$ passes
through the origin and is convex over its entire extent. (PROOF: Write $f(x) = -\ln(1 - x)$;
then $f'(x) = 1/(1 - x)$ and $f''(x) = 1/(1 - x)^2 > 0$ for all $x < 1$.)

Figure 5.2 The graph shows that for $0 < x \le 1/2$, we have $(2 \ln 2)\,x \ge \ln(1/(1 - x))$.

The convexity of $\Gamma$ implies that the chord joining the points $A(0, 0)$ and
$B(1/2, \ln 2)$ lies completely above the curve. The equation of $AB$ is $y = (2 \ln 2)\,x$,
so that over the range $0 < x \le 1/2$ we have the inequality:

$$(2 \ln 2)\,x \ge \ln \frac{1}{1 - x}.$$

Since $\ln 2 \approx 0.69315 < 0.7 = 7/10$, (9) follows.
Inequality (9) implies that

$$\frac{7x}{5} > \ln \frac{1}{1 - x}$$

for $x = 1/2, x = 1/3, x = 1/5, \ldots$. Therefore, by addition,

$$\frac{7}{5} \sum_{p \in P_n} \frac{1}{p} > \sum_{p \in P_n} \ln \frac{1}{1 - 1/p}. \qquad (10)$$

Combining (8) and (10), we deduce that

$$\sum_{p \in P_n} \frac{1}{p} > \frac{5}{7} \ln \ln n.$$

As $n \to \infty$, the right side diverges to infinity, therefore so does the left side; so we
reach our desired objective, that of showing the divergence of $\sum 1/p_i$.
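The chain of inequalities (6), (7), (8), (10) and the final bound can be observed numerically for particular values of $n$. The following sketch is only a sanity check under our own naming; it uses a sieve of Eratosthenes to list the primes up to $n$, and of course a finite computation cannot replace the proof.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n (for n >= 2)."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def divergence_chain(n):
    """Return the quantities appearing in (6)-(10) for a given n >= 2."""
    primes = primes_up_to(n)
    return {
        "product": math.prod(p / (p - 1) for p in primes),            # left side of (6)
        "harmonic": math.fsum(1 / j for j in range(1, n + 1)),        # right side of (6)
        "log_sum": math.fsum(math.log(p / (p - 1)) for p in primes),  # left side of (8)
        "prime_sum": math.fsum(1 / p for p in primes),                # sum of 1/p
        "ln_n": math.log(n),
        "ln_ln_n": math.log(math.log(n)),
    }

if __name__ == "__main__":
    for n in (10, 1000, 100000):
        v = divergence_chain(n)
        print(n, v["prime_sum"], 5 / 7 * v["ln_ln_n"])
```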


An Alternative Proof 


Here is an alternative proof of the claim that $\sum_i 1/p_i$ diverges. The proof has been
written in an ‘old-fashioned’ style and purists will protest. Nevertheless, we shall
present the proof and let the readers judge for themselves. Let $S$ denote the sum
$\sum_i 1/p_i$. We shall make use of the following result:

$$e^x \ge 1 + x \quad \text{for all real values of } x,$$

with equality holding precisely when $x = 0$. The graphs of $e^x$ and $1 + x$ show why
this is true; the former graph is convex over its entire extent (examine the second
derivative of $e^x$ to see why), while the latter, a line, is tangent to the former at the
point (0, 1), and lies entirely below it everywhere else. Substituting the values $x =
1/2, x = 1/3, x = 1/5, \ldots$, successively into this inequality, we find that

$$e^{1/2} > 1 + \frac{1}{2}, \quad e^{1/3} > 1 + \frac{1}{3}, \quad e^{1/5} > 1 + \frac{1}{5}, \quad \ldots.$$

Multiplying together the corresponding sides of these inequalities, we obtain:

$$e^S > \Big(1 + \frac{1}{2}\Big) \Big(1 + \frac{1}{3}\Big) \Big(1 + \frac{1}{5}\Big) \Big(1 + \frac{1}{7}\Big) \cdots.$$

The infinite product on the right side yields the following series:

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{10} + \frac{1}{11} + \frac{1}{13} + \frac{1}{14} + \frac{1}{15} + \cdots.$$

This series is the sum of the reciprocals of all the positive integers whose prime
factors are all distinct; equivalently, the positive integers that have no squared factors.
These numbers are sometimes referred to as the quadratfrei or square-free numbers.
Let $Q$ denote this sum. We shall show that this series itself diverges, in other words,
that $Q = \infty$. This will immediately imply that $S = \infty$ (for $e^S > Q$), and Euler’s
result will then follow.

We consider the product

$$Q \times \Big(\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots\Big).$$

This product, when expanded, gives the following series:

$$\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots,$$

that is, we obtain the harmonic series. To see why, note that every positive integer
$n$ can be uniquely written as a product of a square-free number and a square; for
example, $1000 = 10 \times 10^2$, $2000 = 5 \times 20^2$, $1728 = 3 \times 24^2$, and so on. Now when
we multiply

$$\Big(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{10} + \frac{1}{11} + \frac{1}{13} + \frac{1}{14} + \frac{1}{15} + \cdots\Big)$$

with

$$\Big(1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots\Big)$$

we find, by virtue of the remark just made, that the reciprocal of each positive integer
$n$ occurs precisely once in the expanded product. This explains why the product is
just the harmonic series. Now recall that the sum

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots$$

is finite (indeed, we have shown that it is less than 2). It follows that

$$Q \times (\text{some finite number}) = \infty.$$

Therefore $Q = \infty$, and Euler’s result ($\sum_i 1/p_i = \infty$) follows. QED!
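The key step in this argument is the unique decomposition of a positive integer as a square-free number times a square. A minimal sketch of how such a decomposition can be computed by trial division is given below; the function names are our own, and the method is meant for small inputs only.

```python
def squarefree_decomposition(n):
    """Return (s, m) with n == s * m * m and s square-free, for an integer n >= 1."""
    s, m = 1, 1
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:
            n //= d
            exponent += 1
        m *= d ** (exponent // 2)   # pairs of equal prime factors go into the square part
        if exponent % 2:
            s *= d                  # an unpaired prime factor goes into the square-free part
        d += 1
    s *= n                          # what is left is 1 or a single prime
    return s, m

def is_squarefree(n):
    """Return True if no square greater than 1 divides n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True

if __name__ == "__main__":
    for n in (1000, 2000, 1728):
        print(n, squarefree_decomposition(n))
```

For the three examples in the text this gives the pairs (10, 10), (5, 20) and (3, 24), in agreement with $1000 = 10 \times 10^2$, $2000 = 5 \times 20^2$ and $1728 = 3 \times 24^2$.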

Readers who are unhappy with this style of presentation, in which $\infty$ is treated as
an ordinary real number, will find it an interesting (but routine) exercise to rewrite
the proof to accord with more exacting standards of rigour and precision.

[Figure: facsimile of the original Latin text of the proof.]

Johann’s divergence proof, from Jakob’s Tractatus de Seriebus Infinitis, republished in 1713.
(From page 197 of Journey through Genius by William Dunham.)


Conclusion 

A much deeper (but also more difficult) analysis shows that the sum $1/p_1 + 1/p_2 +
1/p_3 + \cdots + 1/p_n$ is approximately equal to $\ln \ln n$. This is usually stated in the
following form: as $n$ tends to $\infty$, the fraction

$$\frac{1/p_1 + 1/p_2 + 1/p_3 + \cdots + 1/p_n}{\ln \ln n}$$

tends to 1. This is indeed a striking result, reminiscent of the earlier result that $1/1 +
1/2 + 1/3 + \cdots + 1/n$ is approximately equal to $\ln n$. It shows the staggeringly
slow rate of divergence of the sum of the reciprocals of the primes. The harmonic
series $\sum 1/i$ diverges slowly enough: to achieve a sum of over 100, for instance,
we would need to add more than $10^{43}$ terms, so it is certainly not a job that one can
leave to finish off over a weekend. (Do you see where the number $10^{43}$ comes from?)
On the other hand, to achieve a sum of over 100 with the series $\sum_i 1/p_i$, we need to
add something like $10^{10^{43}}$ terms!! This number is so stupendously large that it is a
hopeless task to make any visual image of it. Certainly there is no magnitude even
remotely comparable to it in the whole of the known universe.
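The orders of magnitude quoted here can be recovered from the approximations $S(n) \approx \ln n + 0.577$ and $\sum 1/p \approx \ln \ln n$. The sketch below is a rough estimate under exactly those two assumptions, ignoring all lower-order corrections; the function names are ours. Since $e^{e^{100}}$ is far too large to represent, only its base-10 logarithm is computed.

```python
import math

def terms_for_harmonic_sum(target):
    """Approximate n solving ln n + 0.577 = target."""
    return math.exp(target - 0.577)

def log10_of_bound_for_prime_sum(target):
    """Base-10 logarithm of the n solving ln ln n = target, i.e. n = exp(exp(target))."""
    return math.exp(target) / math.log(10)

if __name__ == "__main__":
    # Roughly 1.5e43: the harmonic series needs more than 10^43 terms to pass 100.
    print(terms_for_harmonic_sum(100))
    # Roughly 1.2e43: the bound for the prime series has about 10^43 decimal digits.
    print(log10_of_bound_for_prime_sum(100))
```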










Suggested Reading 

[1] G H Hardy, E M Wright. An Introduction to the Theory of Numbers. 4th ed. 
Oxford. Clarendon Press, 1960. 

[2] Ivan Niven, Herbert S Zuckerman. An Introduction to the Theory of Numbers.
Wiley Eastern Ltd., 1989. 

[3] Tom Apostol. An Introduction to Analytic Number Theory. Narosa Publishing 
House, 1979. 


Shailesh A Shirali
Rishi Valley School 
Rishi Valley 517 352 
Andhra Pradesh 


6 


On Fermat’s Two Squares Theorem 


Shailesh A Shirali 

Introduction 

The purpose of this chapter is to present a proof of the two squares theorem: every 
prime of the form 1 (mod 4) can be written as a sum of two squares. The theorem was 
first stated by Fermat (as usual, with no proof!) and later proved by Euler. The proof 
given here is an elaboration of the one presented by Don Zagier in a crisp note that 
appeared in The American Mathematical Monthly, Vol. 97, No. 2 (Feb 1990). As Zagier
himself remarks in his paper, his proof is not constructive. In the final section we 
make an interesting conjecture which, if correct, will provide a constructive version 
of Zagier’s proof. 

Throughout, $p$ refers to a fixed prime of the form 1 (mod 4), while $\mathbb{N}$ refers to the
set of positive integers. For a finite set $X$, $|X|$ denotes the cardinality of $X$.


Proof of the Two Squares Theorem 

The proof hinges on a study of the solutions in positive integers of the equation
$x^2 + 4yz = p$. Let $S_p$ denote the solution set:

$$S_p = \{(x, y, z) \in \mathbb{N}^3 : x^2 + 4yz = p\}. \tag{1}$$

It is easy to verify that $S_p$ is non-empty (for $(1, 1, \frac{p-1}{4}) \in S_p$) and finite. We shall
show that $|S_p|$ is odd.

Consider the following relations:

$$x^2 + 4yz = (x + 2z)^2 + 4z(y - x - z) = (2y - x)^2 + 4y(x - y + z). \tag{2}$$

From this we see that $\alpha$, $\beta$, $\gamma$ as defined by

$$\alpha(x, y, z) = (x + 2z,\ z,\ y - x - z), \tag{3}$$

$$\beta(x, y, z) = (2y - x,\ y,\ x - y + z), \tag{4}$$

$$\gamma(x, y, z) = (x - 2y,\ x - y + z,\ y), \tag{5}$$

are maps of the solution set in real numbers of $x^2 + 4yz = p$ into itself; still better,
they are unimodular maps: they permute the integer solutions amongst themselves.
(This can be checked by observing that the matrices corresponding to the
three maps are all unimodular, that is, they have determinant $\pm 1$.) Since our interest
lies chiefly in the positive integral solutions, we define subsets $A_p$, $B_p$ and $C_p$ of $S_p$ as
follows:


$$A_p = \{(x, y, z) \in S_p : x < y - z\}, \tag{6}$$

$$B_p = \{(x, y, z) \in S_p : y - z < x < 2y\}, \tag{7}$$

$$C_p = \{(x, y, z) \in S_p : 2y < x\}. \tag{8}$$


We now make the following observations which are easy to verify.

• $S_p = A_p \cup B_p \cup C_p$, that is, $A_p$, $B_p$ and $C_p$ constitute a partition of $S_p$.
Equality cannot hold in any of the defining inequalities because $p$ is prime.
Moreover, $(1, 1, \frac{p-1}{4}) \in B_p$.

• $\alpha$ maps $A_p$ into $C_p$ and $\gamma$ maps $C_p$ into $A_p$; moreover, $\alpha$ and $\gamma$ are inverses of
one another. Since $A_p$ and $C_p$ are finite sets, it follows that $|A_p| = |C_p|$.

• $\beta$ maps $B_p$ into itself, and $\beta$ is its own inverse (it is an involution), so it pairs
up elements of $B_p$ with one another, except possibly for the fixed points, that is,
the triples $(x, y, z)$ which get mapped to themselves; these have no mates and
stand alone.

• $\beta$ has just one fixed point. For, if $(x, y, z)$ is a fixed point, then
$(2y - x, y, x - y + z) = (x, y, z)$, so $x = y$. This gives $p = x(x + 4z)$, implying that $x = 1$
and $x + 4z = p$ since $p$ is prime. It follows that $(1, 1, \frac{p-1}{4})$ is the sole fixed
point of $\beta$.

• $|B_p|$ is odd, for $\beta$ is an involution on $B_p$ with just one fixed point. In turn this
implies that $|S_p|$ is odd (because $|A_p| = |C_p|$).

Observe that for each element $(x, y, z) \in S_p$, its ‘mate’ $(x, z, y)$ also lies in $S_p$.
Since $S_p$ has an odd number of elements, it follows that $S_p$ must contain an ‘odd man
out’ which is its own ‘mate’. If $(r, s, s)$ is such an element of $S_p$, then $p = r^2 + (2s)^2$,
and we are through.
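
The counting argument can be checked by brute force for small primes. The following is a minimal sketch, not part of the original proof; the helper names `solution_set` and `beta` are ours. It lists $S_p$, splits it into $A_p$, $B_p$, $C_p$ using the inequalities (6) to (8), and prints the fixed points of the map (4) on $B_p$ together with the self-mated triples $(r, s, s)$.

```python
def solution_set(p):
    """All triples (x, y, z) of positive integers with x*x + 4*y*z == p."""
    triples = []
    x = 1
    while x * x < p:
        rest = p - x * x
        if rest % 4 == 0:
            m = rest // 4
            # every factorisation m = y * z gives one triple
            triples += [(x, y, m // y) for y in range(1, m + 1) if m % y == 0]
        x += 1
    return triples

def beta(t):
    """The map (4): (x, y, z) -> (2y - x, y, x - y + z)."""
    x, y, z = t
    return (2 * y - x, y, x - y + z)

for p in (5, 13, 17, 29, 37, 41):
    S = solution_set(p)
    A = [t for t in S if t[0] < t[1] - t[2]]
    B = [t for t in S if t[1] - t[2] < t[0] < 2 * t[1]]
    C = [t for t in S if t[0] > 2 * t[1]]
    fixed = [t for t in B if beta(t) == t]
    self_mated = [t for t in S if t[1] == t[2]]
    print(p, len(S), (len(A), len(B), len(C)), fixed, self_mated)
```

For each prime in the list one should see $|A_p| = |C_p|$, a single fixed point $(1, 1, \frac{p-1}{4})$, an odd value of $|S_p|$, and a self-mated triple giving the two squares.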


Towards a Constructive Proof 

Note that the proof presented is not constructive—it provides no clue as to how the 
desired (r, s) can be computed for a given p. (Curiously, this is true for most known 
proofs of the theorem.) However, the argument used does suggest the possibility of 
an algorithmic proof. I have empirically found that the following algorithm ‘works’, 
in the sense that it always seems to terminate. However, I have not been able to devise 
a proof of termination; if found, then a constructive proof of the two squares theorem
is at hand.¹ Perhaps some reader would like to take up the challenge and settle the
matter.

Consider the set $I_p$ of integer triples $(x, y, z)$ for which $x^2 + 4yz = p$. The set is
non-empty, for $(1, 1, \frac{p-1}{4}) \in I_p$. Our objective is to find a triple in $I_p$ of the form
$(r, s, s)$; this would immediately provide the desired representation of $p$ as a sum of
two squares ($p = r^2 + (2s)^2$). Towards this end we define a function $f : I_p \to I_p$ as
follows:

$$f(x, y, z) = \begin{cases} (x + 2z,\ y - z - x,\ z) & \text{if } z + x < y, \\ (2y - x,\ z + x - y,\ y) & \text{if } z + x > y. \end{cases}$$

Example. Let $p = 17$; then $f(1, 1, 4) = (1, 4, 1)$ and $f(1, 4, 1) = (3, 2, 1)$.

We now compute the orbit of the triple $(1, 1, \frac{p-1}{4})$ under action by $f$. If at some
stage we reach a triple of the form $(r, s, s)$ we terminate the computation. The curious
thing is that we always seem to reach such a triple. Listed below are the initial
segments of the orbits for a few $p$’s. In each case we stop when the desired triple is
reached.

• $p = 17$

(1, 1, 4), (1, 4, 1), (3, 2, 1), (1, 2, 2); result: $17 = 1^2 + 4^2$.

• $p = 29$

(1, 1, 7), (1, 7, 1), (3, 5, 1), (5, 1, 1); result: $29 = 5^2 + 2^2$.

• $p = 41$

(1, 1, 10), (1, 10, 1), (3, 8, 1), (5, 4, 1), (3, 2, 4), (1, 5, 2),
(5, 2, 2); result: $41 = 5^2 + 4^2$.

• $p = 53$

(1, 1, 13), (1, 13, 1), (3, 11, 1), (5, 7, 1), (7, 1, 1);
result: $53 = 7^2 + 2^2$.

• $p = 109$

(1, 1, 27), (1, 27, 1), (3, 25, 1), (5, 21, 1), (7, 15, 1), (9, 7, 1),
(5, 3, 7), (1, 9, 3), (7, 5, 3), (3, 5, 5); result: $109 = 3^2 + 10^2$.

Any takers? 
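
For readers who wish to experiment, the iteration is easy to program. The sketch below transcribes the function $f$ defined above; the name `shirali_orbit` and the safety bound `max_steps` are our own additions, included only so that the loop cannot run forever if termination were to fail.

```python
def f(t):
    """One step of the map f on a triple (x, y, z) with x*x + 4*y*z == p."""
    x, y, z = t
    if z + x < y:
        return (x + 2 * z, y - z - x, z)
    return (2 * y - x, z + x - y, y)

def shirali_orbit(p, max_steps=10**6):
    """Iterate f from (1, 1, (p-1)/4) until a triple (r, s, s) appears."""
    t = (1, 1, (p - 1) // 4)
    orbit = [t]
    for _ in range(max_steps):
        if t[1] == t[2]:          # reached (r, s, s): p = r^2 + (2s)^2
            return orbit
        t = f(t)
        orbit.append(t)
    raise RuntimeError("no triple (r, s, s) found within max_steps")

for p in (17, 29, 41, 53, 109):
    r, s, _ = shirali_orbit(p)[-1]
    print(p, "=", r, "^2 +", 2 * s, "^2")
```

Running it reproduces the five orbits listed above.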


Further Remarks 

• Weil writes in his book (see Suggested Reading) that “all known proofs begin
... by showing that $-1$ is a quadratic residue of $p = 4n + 1$”. This being so,
Zagier’s proof is rather atypical.

• The theorem was stated by Fermat in 1640; he never published any proof but
in all likelihood did possess one, probably based on the principle of infinite
descent (which itself is one of Fermat’s inventions). The first published proof,
by Euler, appeared in the 1740’s; it too uses the principle of infinite descent.


¹ This conjecture was subsequently settled in the affirmative by B Bagchi; see Chapter 7.


Suggested Reading 

Andre Weil. Number Theory: An Approach Through History, 1984. 


Shailesh A Shirali
Rishi Valley School 
Rishi Valley 517 352 
Andhra Pradesh 


7 


Fermat’s Two Squares Theorem Revisited 


B Bagchi 

The Two Squares Theorem 

Throughout this article, $p$ is a prime such that $p \equiv 1 \pmod 4$. $\mathbb{N}$ and $\mathbb{Z}$ will denote,
as usual, the set of all natural numbers (excluding zero) and the set of all integers
(positive, negative or zero), respectively. Recall that the celebrated two squares
theorem (first stated by Fermat and proved by Euler) says that $p$ can be written as a sum
of two perfect squares. Clearly one of these two squares must be even (and the other
one is odd). Therefore, this theorem may be formulated by saying that there exists
$(x, y) \in \mathbb{N} \times \mathbb{N}$ such that $x^2 + 4y^2 = p$. Any such pair $(x, y)$ will be referred to as
a representation of $p$. (Actually, as is well known, the representation is unique. For
proof, see for instance Niven and Zuckerman in Suggested Reading.)


Permutations 

G H Hardy writes that the two squares theorem ‘is ranked, very justly, as one of the 
finest in arithmetic’. So it comes as a surprise to learn that its finest proof was found 
only in 1990. In that year, D Zagier modified a proof of the two squares theorem due 
to Heath-Brown to create a remarkably short and elegant proof. Although Zagier’s
proof was presented in detail by Shirali in Resonance (see Suggested Reading), we 
shall begin with a brief account of this proof. To do so, we need to recall some facts 
about permutations. 

If $X$ is a finite set, then by a permutation of $X$ we mean a function from $X$ into
itself under which each element of $X$ has a unique pre-image. If $\pi$ and $\sigma$ are any two
permutations of $X$, then we can form their ‘product’ $\pi\sigma$ by composition:
$\pi\sigma(x) := \pi(\sigma(x))$, $x$ in $X$. If $X$ is of size $n$, there are only $n!$ permutations of $X$ and
they form a group with this product rule. (Though, strictly speaking, we need no group theory
for this article, familiarity with the elements of this theory will still be useful.) Since
we have defined the product of any two permutations, in particular we can form the
powers $\pi = \pi^1, \pi^2, \pi^3, \ldots$ of any given permutation $\pi$. Since there are only finitely
many distinct permutations of $X$, some two of the powers of $\pi$ must actually be
equal. By cancellation, it follows that there must exist a natural number $m$ such that
$\pi^m$ is the identity permutation id fixing all elements of $X$. The smallest such number
is called the order of $\pi$. A permutation of order two is called an involution.

Any permutation $\pi$ of $X$ breaks up (‘partitions’) $X$ into one or more parts such
that two elements $x$ and $y$ of $X$ are in the same part if and only if some power of $\pi$
takes $x$ to $y$. These parts are called the orbits of $\pi$. The singleton orbits are just the
fixed points of $\pi$. A permutation of $X$ is said to be transitive on $X$ if it has only one
orbit (namely, the whole of $X$).

It is easy to convince oneself that the size of any orbit of a permutation divides the
order of the permutation. In particular, if the permutation $\pi$ has prime order $q$, then
(as 1 and $q$ are the only divisors of $q$) each orbit is either a fixed point or has size $q$.
It follows that, in this case, the number of fixed points of $\pi$ is congruent modulo $q$
to the size $n$ of $X$. Hence $\pi$ has a fixed point if $n$ is not a multiple of $q$. As a special
case ($q = 2$) of this observation, we see that an involution of $X$ has a fixed point in
$X$ if $X$ is an odd set (i.e., the number of elements of $X$ is odd). This is the key fact
which makes Zagier’s proof (and its constructive versions presented here) work.


Zagier’s Proof 

Now we come to Zagier’s proof. Let $S$ denote the subset of $\mathbb{N} \times \mathbb{N} \times \mathbb{N}$ defined by

$$S = \{(x, y, z) \in \mathbb{N} \times \mathbb{N} \times \mathbb{N} : x^2 + 4yz = p\}.$$

Clearly $S$ is a finite set. Zagier defines two involutions $\alpha$ and $\beta$ of $S$ by

$$\alpha(x, y, z) = \begin{cases} (x + 2z,\ z,\ y - x - z) & \text{if } x < y - z, \\ (2y - x,\ y,\ x + z - y) & \text{if } y - z < x < 2y, \\ (x - 2y,\ x + z - y,\ y) & \text{if } x > 2y, \end{cases}$$

$$\beta(x, y, z) = (x, z, y).$$

The involution $\alpha$ of the finite set $S$ has a unique fixed point (namely $(1, 1, \frac{p-1}{4})$). It
follows that $S$ is an odd set. Therefore, the involution $\beta$ of the odd set $S$ must have
an odd number (hence at least one) of fixed points in $S$. But $(x, y) \mapsto (x, y, y)$ is a
bijection of the set of representations of $p$ onto the set of fixed points of $\beta$. Hence
$p$ has at least one representation (as a sum of two squares). This completes Zagier’s
proof of the two squares theorem.


Shirali’s Conjecture 

Zagier notes in his paper that his proof ‘is not constructive: it does not give a method 
to actually find the representation of p as a sum of two squares’. Perhaps provoked 
by this statement, S A Shirali gave a conjectural way to ‘constructivize’ this proof.
Shirali’s conjecture may be phrased as follows. Define a finite subset $\hat{S}$ of
$\mathbb{Z} \times \mathbb{N} \times \mathbb{N}$ by

$$\hat{S} = \{(x, y, z) \in \mathbb{Z} \times \mathbb{N} \times \mathbb{N} : x + y > z \text{ and } x^2 + 4yz = p\}.$$

Define a function $\hat{\gamma} : \hat{S} \to \hat{S}$ by

$$\hat{\gamma}(x, y, z) = \begin{cases} (x + 2z,\ y - x - z,\ z) & \text{if } x + z < y, \\ (2y - x,\ x + z - y,\ y) & \text{if } x + z > y. \end{cases}$$

Then, Shirali conjectures that the orbit of the point $(1, \frac{p-1}{4}, 1)$ under $\hat{\gamma}$ contains a
point of the form $(x, y, y)$. That is, to obtain a point $(x, y, y) \in \hat{S}$ (and hence a square
plus square representation of $p$), begin with the point $(1, \frac{p-1}{4}, 1)$ and look at the
successive iterates (powers) of $\hat{\gamma}$ on this point until a point $(x, y, y)$ is obtained.

(Actually, Shirali defines his function on the (infinite) set of all points $(x, y, z)$
in $\mathbb{Z} \times \mathbb{Z} \times \mathbb{Z}$ satisfying $x^2 + 4yz = p$, and proposes to begin with the $\alpha$-fixed
point $(1, 1, \frac{p-1}{4})$. However, we observed that this function fixes the finite subset $\hat{S}$
introduced above and on this subset restricts to $\hat{\gamma}$ as defined. Though the $\alpha$-fixed
point itself does not belong to this subset, its image under Shirali’s original function
is $(1, \frac{p-1}{4}, 1)$, which does belong. Therefore, our formulation of the conjecture is
entirely equivalent to Shirali’s original formulation.)


A Constructive Version of Zagier’s Proof 

Notice that the function $\hat{\gamma}$ is a ‘perturbation’ of the permutation $\gamma := \alpha\beta$ of $S$ obtained
by composing Zagier’s involutions $\alpha$ and $\beta$. So it is natural to ask if Shirali’s
conjecture is valid with $\hat{\gamma}$ replaced by $\gamma$. In the following theorem, we show that this
modified conjecture is indeed correct. Note that we now stay within the set $S$, and
this is closer to Zagier’s original proof.

Theorem. Let $k$ denote the size of the orbit $T$ under $\gamma := \alpha\beta$ which contains the
$\alpha$-fixed point $a$. Then $k$ is odd; $T$ contains a unique $\beta$-fixed point $b$ and it is given
by the formula $b = \gamma^{(k-1)/2}(a)$. In fact, the orbit $T$ satisfies the symmetry relation

$$\gamma^{k-1-n}(a) = \beta(\gamma^{n}(a)) \quad \text{for } 0 \le n \le k - 1.$$

Thus, to obtain a $\beta$-fixed point $(x, y, y)$ (and hence the representation
$p = x^2 + (2y)^2$), begin with the $\alpha$-fixed point and iterate $\alpha\beta$ on it; in a finite number of steps
you will reach a $\beta$-fixed point. This theorem shows that exactly half the orbit has to
be traversed before this point is reached; the remaining half may be found (in reverse
order) simply by applying $\beta$ to the first half.

Proof. Since $\alpha$ and $\beta$ are involutions, $\alpha$ ‘normalises’ $\gamma$: $\alpha\gamma\alpha^{-1} = \beta\alpha = \gamma^{-1}$.
Therefore, $\alpha$ maps the orbits of $\gamma$ to orbits of $\gamma$. (To see this, let $s_1$ and $s_2$ be two
points from a common $\gamma$-orbit. By definition, this means that there is an integer $\ell$
such that $\gamma^{\ell}(s_1) = s_2$. Then $\alpha(s_2) = \alpha\gamma^{\ell}(s_1) = \alpha\gamma^{\ell}\alpha^{-1}(\alpha(s_1)) = \gamma^{-\ell}(\alpha(s_1))$.
Thus, whenever $s_1$ and $s_2$ in $S$ are from a common $\gamma$-orbit, $\alpha(s_1)$ and $\alpha(s_2)$ are also
in a common $\gamma$-orbit. So the image under $\alpha$ of any $\gamma$-orbit is again a $\gamma$-orbit.) In
particular, if $T$ is the orbit under $\gamma$ which contains the fixed point $a$ of $\alpha$, then $\alpha(T)$ is
an orbit which meets the orbit $T$ in this fixed point, hence we must have $\alpha(T) = T$.
Since the restriction of $\alpha$ to $T$ is an involution of $T$ with a unique fixed point, it
follows as before that $T$ is an odd set. Since both $\alpha$ and $\gamma$ fix $T$, so does $\beta = \alpha\gamma$. Thus
the restriction to $T$ of $\beta$ is an involution of the odd set $T$, and hence $\beta$ must have a
fixed point $b$ in $T$. So there is an $\ell$, $0 \le \ell \le k - 1$, such that $b = \gamma^{\ell}(a)$ is fixed by
$\beta$. To prove the uniqueness of this fixed point, it suffices to show that $k = 2\ell + 1$ is
forced on us.

For $m \in \mathbb{Z}$, we have $\beta(\gamma^{m}(b)) = \beta\gamma^{m}\beta^{-1}(\beta(b)) = \gamma^{-m}(b)$. Substituting $\gamma^{\ell}(a)$ for
$b$, we find that the orbit $T$ has a two-fold symmetry around its $\ell$th term:

$$\gamma^{\ell+m}(a) = \beta(\gamma^{\ell-m}(a)) \quad \forall m \in \mathbb{Z}.$$

In particular, taking $m = \ell + 1$ in this identity, we get
$\gamma^{2\ell+1}(a) = \beta\gamma^{-1}(a) = \beta^2\alpha(a) = \alpha(a) = a$. From the definition of $k$, one sees that an integer $h$
satisfies $\gamma^{h}(a) = a$ iff $h$ is an integral multiple of $k$. Since $h = 2\ell + 1$ satisfies this
condition, $2\ell + 1$ is a multiple of $k$. Since $1 \le 2\ell + 1 < 2k$, this forces $2\ell + 1 = k$.
Finally, substituting $\ell - m = n$ in the displayed identity, we get the last
assertion of the theorem.
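
The theorem is easy to test by machine. The following sketch implements Zagier's involutions as defined earlier and walks the $\gamma$-orbit of the $\alpha$-fixed point; the function names (`gamma_orbit`, `two_squares`) are ours and the code is meant as an illustration under these definitions, not as an efficient algorithm.

```python
def alpha(t):
    """Zagier's involution alpha on triples with x*x + 4*y*z == p."""
    x, y, z = t
    if x < y - z:
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:
        return (2 * y - x, y, x + z - y)
    return (x - 2 * y, x + z - y, y)

def beta(t):
    """Zagier's involution beta: swap the last two coordinates."""
    x, y, z = t
    return (x, z, y)

def gamma(t):
    """The permutation gamma = alpha o beta."""
    return alpha(beta(t))

def gamma_orbit(p):
    """The orbit a, gamma(a), gamma^2(a), ... of a = (1, 1, (p-1)/4)."""
    a = (1, 1, (p - 1) // 4)
    orbit = [a]
    t = gamma(a)
    while t != a:
        orbit.append(t)
        t = gamma(t)
    return orbit

def two_squares(p):
    """Read off p = x^2 + (2y)^2 from the middle term of the orbit."""
    orbit = gamma_orbit(p)
    x, y, z = orbit[(len(orbit) - 1) // 2]
    assert y == z and x * x + 4 * y * y == p
    return x, 2 * y

print(gamma_orbit(17))
print(two_squares(41), two_squares(229))
```

For $p = 17$ the orbit has $k = 5$ terms, its middle term is $(1, 2, 2)$, and the second half is the image under $\beta$ of the first half in reverse order, as the symmetry relation predicts.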


Shirali’s Conjecture Vindicated 

Define the involutions $\hat{\alpha}$ and $\hat{\beta}$ of the finite set $\hat{S}$ as follows:

$$\hat{\alpha}(x, y, z) = (2z - x,\ x + y - z,\ z),$$

$$\hat{\beta}(x, y, z) = \begin{cases} (-x,\ y,\ z) & \text{if } x + z < y, \\ (x,\ z,\ y) & \text{if } x + z > y. \end{cases}$$

One readily verifies that (i) these are indeed involutions of $\hat{S}$, (ii) $\hat{\alpha}$ has a unique
fixed point, namely $\hat{a} := (1, \frac{p-1}{4}, 1)$, and (iii) $(x, y) \mapsto (x, y, y)$ is a bijection from the
representations of $p$ onto the fixed points of $\hat{\beta}$. Thus, in Zagier’s proof, one may
replace $\alpha$, $\beta$ and $S$ by $\hat{\alpha}$, $\hat{\beta}$ and $\hat{S}$, respectively. Finally, Shirali’s function $\hat{\gamma}$ is related
to these involutions by $\hat{\gamma} = \hat{\alpha}\hat{\beta}$. Therefore, the indicated substitutions in the proof of
the above theorem yield a ‘hatted’ version of the theorem. In particular, this proves
Shirali’s conjecture.


Uniqueness of the Square Plus Square 
Representation of p 

Aside from being non-constructive, Zagier’s proof has another shortcoming. As
already mentioned, the prime $p$ has a unique representation as a sum of two squares.
Or, what amounts to the same thing, $\beta$ also has a unique fixed point in $S$. But this does
not emerge from Zagier’s proof (or from its constructive variations given above). We
are unable to remedy this defect. Notice, however, that in view of the uniqueness
assertion in the above theorem, it would suffice to show that $\gamma$ acts transitively on $S$.
(For, this would mean that $T = S$, and we know that $\beta$ has a unique fixed point in
$T$.) Computations by hand show that this is indeed correct for primes below a hundred.
One might therefore be tempted to conjecture that, generally, $\gamma$ acts transitively on
$S$. If correct, this would provide a neat explanation for the uniqueness of the $\beta$-fixed
point. Unfortunately, this conjecture is incorrect. Its validity for small primes turns
out to be yet another instance of the ‘strong law of small numbers’. (If you have
never heard of this law then you are urged to take a look at the beautiful article by
Guy [1].)

We see this as follows.

For each fixed $x$, the number of points in $S$ with the given first coordinate equals
$d\left(\frac{p - x^2}{4}\right)$. Therefore we have the formula

$$\#(S) = \sum_{x} d\left(\frac{p - x^2}{4}\right),$$

where the sum is over all odd numbers $x$ in the range $1 \le x < \sqrt{p}$. (Here $d(\cdot)$ is the
usual divisor function: for $n \in \mathbb{N}$, $d(n)$ is the number of divisors of $n$ including 1
and $n$.)

Let $p$ be of the form $k^2 + 4$ (for an odd number $k$). Then, in the iterates under $\gamma$
of the point $a = (1, 1, \frac{p-1}{4})$, the first coordinate increases in steps of two until the
point $b = (k, 1, 1)$ is reached, then it decreases in steps of two until we reach the end
point $(1, \frac{p-1}{4}, 1)$ of the orbit. This shows that in this case, the size $k$ of the orbit $T$
is related to the prime $p$ by $p = k^2 + 4$. Also, the sum in the formula for $\#(S)$ given
above has $(k + 1)/2$ terms of which one term equals 1 while the remaining $(k - 1)/2$
terms are $\ge 2$. Since $d(n) = 2$ iff $n$ is a prime, it follows that for a prime of the form
$p = k^2 + 4$, $\gamma$ is transitive on $S$ (i.e., $k = \#(S)$) iff $(p - x^2)/4$ is a prime for all odd
numbers $x$ in the range $1 \le x < k$. This shows, for instance, that we do not have
transitivity for $p = 229$.
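
The divisor-sum formula and the transitivity criterion can be checked numerically. In this sketch the helper names (`num_divisors`, `size_of_S`, `transitive_for_square_plus_four`) are ours, and the criterion is applied only to numbers of the form $k^2 + 4$ with $k$ odd, as in the text.

```python
def num_divisors(n):
    """d(n): the number of positive divisors of n, by trial division."""
    count, i = 0, 1
    while i * i <= n:
        if n % i == 0:
            count += 1 if i * i == n else 2
        i += 1
    return count

def size_of_S(p):
    """#(S) as the sum of d((p - x^2)/4) over odd x with x^2 < p."""
    total, x = 0, 1
    while x * x < p:
        total += num_divisors((p - x * x) // 4)
        x += 2
    return total

def transitive_for_square_plus_four(k):
    """For p = k^2 + 4 (k odd): is (p - x^2)/4 prime for all odd x < k?"""
    p = k * k + 4
    return all(num_divisors((p - x * x) // 4) == 2 for x in range(1, k, 2))

for k in (3, 5, 7, 13, 15):
    p = k * k + 4
    print(p, k, size_of_S(p), transitive_for_square_plus_four(k))
```

For $p = 229 = 15^2 + 4$ the orbit has size 15 while the formula gives $\#(S) = 29$, so $\gamma$ is not transitive there; already $x = 1$ gives $(229 - 1)/4 = 57 = 3 \times 19$.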


Inefficiency of the Algorithm 

Clearly, the $\alpha$-$\beta$ algorithm needs at most $\frac{1}{2}\#(S)$ steps. Since $d(n) = O(n^{\varepsilon})$ and
the formula for $\#(S)$ has $O(p^{1/2})$ terms in it, the number of necessary iterations is
$O(p^{1/2+\varepsilon})$. The example of primes of the form square plus four (presumably there are
infinitely many such primes) shows that this estimate is close to the best possible.
Wagon describes known algorithms whose complexity is polynomial in $\log p$, and
the $\alpha$-$\beta$ algorithm compares very unfavourably (see Suggested Reading). But it may
be that we have looked at the worst case, and for some large class of primes its
performance is much better. Moreover, it may be possible to significantly improve on
the performance of the algorithm as follows. The set $S$ can be partitioned into three
parts on each of which $\gamma$ is linear (the permutation $\hat{\gamma}$ is even better in this respect; we
have a partition of $\hat{S}$ into two parts on each of which $\hat{\gamma}$ is linear). The runs of iteration
during which the iterates stay in the same piece of $S$ may easily be combined into a
single step.








A Combinatorial Lemma 

The perceptive reader may have suspected by now that the theorem presented above 
does not have much to do with primes or their representations by squares. This is 
indeed correct, and the theorem is a manifestation of a combinatorial phenomenon. 
We have: 

Lemma. For any two involutions $\alpha$ and $\beta$ of a finite set $S$, there are only three
possibilities for any $\alpha\beta$-orbit: (i) neither involution has a fixed point in the orbit, or
(ii) each of them has a unique fixed point in the orbit, or (iii) one of them has two
fixed points in the orbit while the other has none.

At first glance, this statement may look very strange. (For readers with a reasonable 
amount of familiarity with groups and group actions, here is a hint for a group- 
theoretic proof of this lemma: think of the group of isometries of a regular polygon.) 
But here is an elementary (‘graph-theoretic’) proof. 

Let $\gamma = \alpha\beta$. Fix a $\gamma$-orbit $T$. If neither $\alpha$ nor $\beta$ has a fixed point in $T$, then there is
nothing to prove: we are in case (i) of the lemma. So assume that one of these two
involutions has at least one fixed point. Then, arguing as in the proof of the above
theorem, one sees that $T$ is fixed by both $\alpha$ and $\beta$. Thus $T$ is a union of $\alpha$-orbits
as well as of $\beta$-orbits. If $T$ is a singleton, then we are in case (ii) and again there
is nothing to prove. So we may assume that $T$ has at least two elements. Hence no
element of $T$ is fixed by $\gamma$.

Now consider the graph $G$ defined as follows. The vertices of $G$ are the elements
of $T$. Two distinct elements $x$, $y$ of $T$ are joined by an edge in $G$ if (and only if)
$y = \alpha(x)$ or $y = \beta(x)$ (i.e., if $\{x, y\}$ is an orbit of one of the involutions). Clearly,
this is an undirected graph. Note that, for each $x$ in $T$, $\alpha(x)$ and $\beta(x)$ are distinct
elements of $T$, or else $x$ would be fixed by $\gamma$, contrary to our assumption. It follows
that each vertex $x$ is of degree 1 or 2 in $G$ (i.e., $x$ is joined to one or two vertices),
according to whether $x$ is or is not fixed by one (and only one) of the two involutions.
Since we have assumed that at least one of them has a fixed point in $T$, it follows
that $G$ has at least one vertex of degree one. Also, since $\gamma = \alpha\beta$ is transitive on $T$
($T$ is a $\gamma$-orbit!), it follows that $G$ is connected. Now, here is the punch line: the only
connected graphs with all vertices of degree $\le 2$ and at least one vertex of degree 1
are the paths. Hence $G$ is a path. So $G$ has exactly two vertices of degree 1 (the two
ends of the path) and hence we are in case (ii) or (iii). This proves the lemma.

Exercise: Continue this argument to see that if the elements of T are arranged on 
a circle according to the action of y, then the two ends of G are placed opposite to 
each other. This explains the symmetry observed in the theorem. 


A Prime Testing Algorithm? 

If $n \equiv 1 \pmod 4$ is a number (not necessarily a prime) which is not a perfect square,
then $S$, $\alpha$, $\beta$ may be defined as before with $n$ replacing $p$. What happens if one runs the
$\alpha$-$\beta$ algorithm in this case? Our combinatorial lemma shows that if we look inside the
orbit $T$ containing the fixed point $(1, 1, \frac{n-1}{4})$ of $\alpha$, either we may find a fixed point
of $\beta$ and hence a representation of $n$ as a sum of two squares, or we find a second
fixed point $(x, x, z)$ of $\alpha$ and hence a nontrivial factorisation $n = x(x + 4z)$ of $n$.
The second case is bound to occur if the square free part of $n$ has a 3 (mod 4) factor
(since in this case $n$ has no representation as a sum of two squares). In the former
case, of course, we are unable to decide whether $n$ is a prime or not (for instance,
this case occurs if $n$ is a number of the form $k^2 + 4$, even when $n$ is composite). If,
however, we happen to know a two squares representation of $n$ and the algorithm is
lucky enough to produce a second representation, then we can still conclude that $n$ is
composite (because a prime has at most one such representation). Perhaps it will be
interesting to characterise those numbers $n$ for which the first case occurs.
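
The two outcomes can be observed experimentally. The sketch below runs the iteration $\gamma = \alpha\beta$ from the point $(1, 1, \frac{n-1}{4})$ and stops at the first fixed point of $\beta$ or second fixed point of $\alpha$ that it meets. The function name `alpha_beta_run` and the shape of its return value are our own choices for illustration.

```python
import math

def alpha(t):
    """Zagier's involution alpha, with n in place of p."""
    x, y, z = t
    if x < y - z:
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:
        return (2 * y - x, y, x + z - y)
    return (x - 2 * y, x + z - y, y)

def alpha_beta_run(n):
    """Return ('squares', u, v) with n = u^2 + v^2, or ('factor', d, n // d)."""
    if n % 4 != 1 or math.isqrt(n) ** 2 == n:
        raise ValueError("n must be 1 mod 4 and not a perfect square")
    t = (1, 1, (n - 1) // 4)
    while True:
        x, y, z = t
        if y == z:                       # fixed point of beta
            return ("squares", x, 2 * y)
        if x > 1 and alpha(t) == t:      # second fixed point (x, x, z) of alpha
            return ("factor", x, x + 4 * z)
        t = alpha((x, z, y))             # one step of gamma = alpha o beta

for n in (21, 41, 65, 77):
    print(n, alpha_beta_run(n))
```

For example, $n = 21$ ends at the $\alpha$-fixed point $(3, 3, 1)$, giving $21 = 3 \times 7$, while the composite $n = 65$ ends at $(1, 4, 4)$, giving $65 = 1^2 + 8^2$ and no information about primality.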


Suggested Reading 

[1] R K Guy. The strong law of small numbers. Amer. Math. Monthly. Vol. 95, 
No. 8, pp 697-711,1988. 

[2] I Niven and H S Zuckerman. An Introduction to the Theory of Numbers. Third 
edition. Wiley, 1972. 

[3] S A Shirali. On Fermat’s two squares theorem. Resonance. Vol. 2, No. 3, 
pp 69-73, 1997. 

[4] S Wagon. The Euclidean algorithm strikes again. Amer. Math. Monthly. Vol. 97, 
No. 2, pp 125-126, 1990. 

[5] D Zagier. A one-sentence proof that every prime $p \equiv 1 \pmod 4$ is a sum of two
squares. Amer. Math. Monthly. Vol. 97, No. 2, p 144, 1990.


8


Factoring Fermat Numbers

A Unique Computational Experiment for Factoring $F_9$

C E Veni Madhavan 

Fermat observed that the numbers $F_k = 2^{2^k} + 1$, $k = 0, 1, 2, 3, 4$ are prime, and
wondered whether this was true for all $k$. Euler found that the very next Fermat
number is composite: $F_5 = 2^{32} + 1 = 641 \times 6700417$. So far it has been verified that
$F_k$, $5 \le k \le 22$ are all composite. No one knows whether any other $F_k$ is prime. The
numbers $F_k$ grow rapidly with $k$ (each is almost a square of the previous number)
and it is a very difficult task to decide their primality. We give below an outline of
the relevant computational challenges.

First note that, if $k$ is odd, 3 divides $2^k + 1$ and in general, $2^m + 1$ divides $2^{mk} + 1$.
Thus, if $k$ is not a power of two, $2^k + 1$ is not prime. Fermat hazarded a guess
that the converse was also true. In 1877, Francois Pepin published a necessary and
sufficient condition which states that $F_k$, $k > 1$ is prime if and only if $F_k$ divides
$5^{(F_k - 1)/2} + 1$. This condition is the basis for determining whether $F_k$ is prime for
any given $k$. Failure of this condition means that $F_k$ is composite. It does not reveal
any information about the factors.
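
Pepin's condition is a single modular exponentiation, so small cases can be tried directly. The sketch below uses Python's built-in three-argument `pow`; the function names `fermat` and `pepin` are ours. It is only feasible for small $k$, since $F_k$ has about $0.3 \times 2^k$ decimal digits.

```python
def fermat(k):
    """The Fermat number F_k = 2^(2^k) + 1."""
    return 2 ** (2 ** k) + 1

def pepin(k):
    """Pepin's criterion with base 5, valid for k > 1:
    F_k is prime iff F_k divides 5^((F_k - 1)/2) + 1."""
    if k < 2:
        raise ValueError("the criterion with base 5 needs k > 1")
    F = fermat(k)
    return pow(5, (F - 1) // 2, F) == F - 1

for k in range(2, 11):
    print(k, "prime" if pepin(k) else "composite")
```

As the text says, a failed test gives no factor: the run reports $F_5$ as composite without producing 641.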

Today, sophisticated number theoretic methods and powerful computing platforms
are used for testing primality and factoring of large integers. These find applications
in many practical problems such as cryptography. The recent records in Fermat number
factoring have been achieved by means of two techniques called number field
sieve (NFS) and elliptic curve method (ECM).

The complete factoring of $F_9$, which has about 150 decimal digits, was carried out
in 1992 by a unique computational experiment. Hundreds of computers in different 
parts of the world, working independently and in their spare time generated certain 
seed numbers. These computers sent their seeds by electronic mail to a host computer 
in USA. The host carried out the combination of the seeds and the factoring. The 
NFS method, requiring the generation of an enormous number of such seeds, was 
thus eminently suitable for this exercise. However, this method is quite difficult to 
implement. 


Last year the number $F_{22}$ was determined to be composite, using Pepin’s criterion
and extremely fast arithmetical algorithms implemented on supercomputers. This
number of about 1.3 million decimal digits (about 500 times as long as this chapter)
required about 10 ^^ arithmetical operations and about seven months of real time. 
Complete factorization of Fermat numbers is known only for $k \le 9$ and $k = 11$. No
prime factors of $F_{14}$ and $F_{20}$ are known.


C E Veni Madhavan 
Department of Computer 
Science and Automation 
Indian Institute of Science 
Bangalore 560 012


9


The Class Number Problem

Binary Quadratic Forms 

Rajat Tandon

Introducing the reader to the notion of ‘class numbers’, this chapter defines 

class numbers the way they arose in the study of ‘binary quadratic equations’. 

Remember the formula $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ learnt in school. Indians have a long
history of work on quadratics. The high point seems to have been when Brahmagupta
in the early seventh century gave a method by which, knowing one solution $(x, y)$
in integers of the equation $cX^2 + 1 = Y^2$ (Pell’s Equation!), where $c$ is a constant
integer, he could generate an infinite family of solutions. But I am interested here in
the above formula. I will always assume that $a$, $b$, $c$ are integers. The quantity $b^2 - 4ac$
under the square root sign gives us information about the quadratic $aX^2 + bX + c$.
For instance, it tells us whether the quadratic has any real roots: it must be non-negative
for this to be so. It tells us whether the quadratic has any rational roots: it must
be a perfect square for this to be so. We call it the discriminant of the quadratic
$aX^2 + bX + c$. The class number problem is concerned with the following questions:

1) Given an integer $\Delta$, are there any quadratics $F(X) = aX^2 + bX + c$ ($a$, $b$, $c$
integers) whose discriminant $b^2 - 4ac$ equals $\Delta$?

2) If so, how many such quadratics exist? Can we classify them in any way?

It is obvious that if $\Delta = b^2 - 4ac$, then 4 divides $\Delta$ or 4 divides $\Delta - 1$, i.e., $\Delta \equiv 0$
or $1 \pmod 4$. This is a necessary condition for there to be an integral quadratic with
discriminant $\Delta$. It is a simple exercise to show that it is also sufficient. So we have
a complete answer to the first question. The second question is considerably more
complex.
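
One way to do the exercise is to write down an explicit quadratic in each of the two admissible cases. The sketch below is one possible choice (taking $a = 1$ and $b \in \{0, 1\}$); the function name is ours and other choices work equally well.

```python
def quadratic_with_discriminant(delta):
    """Return integers (a, b, c) with b*b - 4*a*c == delta, or None if
    delta is not congruent to 0 or 1 modulo 4."""
    if delta % 4 == 0:
        return (1, 0, -delta // 4)        # X^2 - delta/4
    if delta % 4 == 1:
        return (1, 1, (1 - delta) // 4)   # X^2 + X + (1 - delta)/4
    return None

for delta in (-23, -4, -3, 0, 5, 12, 6):
    print(delta, quadratic_with_discriminant(delta))
```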

Before proceeding further let me give a quick recap of the notion of an equivalence
relation. A relation $\sim$ on a set $S$ is called an equivalence relation if it is reflexive
($x \sim x$), symmetric ($x \sim y \Rightarrow y \sim x$) and transitive ($x \sim y$ and $y \sim z \Rightarrow x \sim z$).
Let $[x]$ denote the subset of $S$ consisting of elements equivalent to $x$. It is called
an equivalence class; note that any two equivalence classes are either identical or
disjoint. Then $S$ is the disjoint union of distinct equivalence classes. We denote the
set of equivalence classes by $S/\!\sim$.

Suppose we replace $X$ by $X + 1$ in the quadratic $aX^2 + bX + c$. We have
$a(X + 1)^2 + b(X + 1) + c = aX^2 + (b + 2a)X + (a + b + c)$. The discriminant of this is
$(b + 2a)^2 - 4a(a + b + c) = b^2 - 4ac$. So the discriminant does not change if we replace
$F(X) = aX^2 + bX + c$ by $F(X + 1)$ and hence by $F(X + 2), F(X + 3), \ldots$. Similarly,
for $F(X - 1)$, $F(X - 2)$ etc. Notice also that $\Delta = b^2 - 4ac$ is symmetric in $a$ and
$c$, i.e., if we replace $aX^2 + bX + c$ by $cX^2 + bX + a$ then the discriminant does not
change. This indicates that it might be better to replace $F(X) = aX^2 + bX + c$ by the
corresponding homogeneous polynomial in 2 variables $F(X, Y) = aX^2 + bXY + cY^2$.
Then instead of the transformation $X \mapsto X + 1$ we take the transformation

$$X \mapsto X + Y, \quad Y \mapsto Y.$$

Let $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $W = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. These are members of $SL(2, \mathbb{Z})$, the
group of $2 \times 2$ matrices with integer coefficients and determinant 1. If
$A = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in SL(2, \mathbb{Z})$ and $F$ is a homogeneous quadratic polynomial in two
variables, we denote by $A \cdot F$ the polynomial obtained by replacing $X$ by $\alpha X + \beta Y$
and $Y$ by $\gamma X + \delta Y$. Observe that if $A, B \in SL(2, \mathbb{Z})$ then $A \cdot (B \cdot F) = AB \cdot F$. It is
easy to check that if $F$ has discriminant $\Delta$, then so does $A \cdot F$ for any $A \in SL(2, \mathbb{Z})$.
Denote by $S(\Delta)$ the set of all integral homogeneous quadratic polynomials in two
variables of discriminant $\Delta$.

We define an equivalence relation on $S(\Delta)$ by $F \sim G$ if either $F = G$ or there
exists a chain $F_1, F_2, \ldots, F_n$ in $S(\Delta)$ such that $F = F_1$, $G = F_n$ and each $F_{i+1}$ is
either $T \cdot F_i$ or $T^{-1} \cdot F_i$ or $W \cdot F_i$; such a chain is called a chain from $F$ to $G$. It
is easy to see that this gives an equivalence relation on $S(\Delta)$. Hence $S(\Delta)$ can be
partitioned into equivalence classes. We remark that it can be shown that $SL(2, \mathbb{Z})$
is generated by $T$ and $W$ and hence two forms $F$ and $G$ are equivalent if and only if
there exists an $A \in SL(2, \mathbb{Z})$ such that $A \cdot F = G$.

Assume from now on that $\Delta < 0$. This is not because the case $\Delta > 0$ is uninteresting
but because it is more difficult and less is known in this case. If the discriminant
of $F(X, Y) = aX^2 + bXY + cY^2$ is $\Delta$ then it is also so for $-F$. Note that $\Delta < 0$
implies that $a$ and $c$ have the same sign. We define $S_1(\Delta)$ to be the subset of $S(\Delta)$
consisting of those forms $F$ for which $a$ and $c$ are positive, and $S_2(\Delta)$ its complement.
Then $F \mapsto -F$ is a bijection from $S_1(\Delta)$ to $S_2(\Delta)$. It is also easy to see that no
member of $S_1(\Delta)$ can be equivalent to any member of $S_2(\Delta)$. We restrict ourselves
to $S_1(\Delta)$.

Definition. The form F{X, Y) = aX ^ + bXY + cY^ of 5| (A) is said to be almost 
reduced if \b\ < a < c. 

Theorem. Each equivalence class in S\ (A) has at least one almost reduced form. 

Proof. Consider an equivalence class with an element $F(X, Y) = aX^2 + bXY + cY^2$
in it. If $a > c$ replace $F$ by $W \cdot F = F_1$ (say). Then
$F_1(X, Y) = a_1X^2 + b_1XY + c_1Y^2$ with $a_1 = c$ and $c_1 = a$, and so $a_1 < c_1$.
Notice $a > a_1$. If now $|b_1| \le a_1$, $F_1$ is almost reduced. If not, find an
integer $n$ such that $|b_1 + 2a_1n| \le a_1$. Replace $F_1$ by $F_2 = T^n \cdot F_1$. Then

$$F_2(X, Y) = a_1(X + nY)^2 + b_1(X + nY)Y + c_1Y^2$$
$$= a_1X^2 + (b_1 + 2a_1n)XY + (a_1n^2 + b_1n + c_1)Y^2$$
$$= a_2X^2 + b_2XY + c_2Y^2$$

(say), with $|b_2| \le a_2$ and $a_2 = a_1$. But now $a_2$ may not be less than or equal
to $c_2$. If so, again apply $W$ and continue as before. After a finite number of steps
we get an almost reduced form (finite because every application of $W$ strictly
decreases the leading coefficient, which stays a positive integer).
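
The proof is effectively an algorithm, and a minimal Python sketch of it may help. It assumes, as the formulas in the proof indicate, that $W$ sends the coefficient triple $(a, b, c)$ to $(c, -b, a)$ and that $T^n$ sends it to $(a, b + 2an, an^2 + bn + c)$. The function name `reduce_form` and the triple representation are illustrative choices, not taken from the text.

```python
def reduce_form(a, b, c):
    """Return an almost reduced form (|b| <= a <= c) equivalent to aX^2 + bXY + cY^2.

    Assumes a, c > 0 and b^2 - 4ac < 0 (a form in S_1(Delta)).
    """
    if a <= 0 or c <= 0 or b * b - 4 * a * c >= 0:
        raise ValueError("expected a positive definite form")
    while True:
        if a > c:
            # W: swap the outer coefficients and negate the middle one.
            a, b, c = c, -b, a
        if abs(b) <= a:
            return a, b, c
        # T^n: replace X by X + nY, with n chosen so that -a < b + 2an <= a.
        n = (a - b) // (2 * a)
        b, c = b + 2 * a * n, a * n * n + b * n + c


if __name__ == "__main__":
    print(reduce_form(1, 7, 18))   # a form of discriminant -23
    print(reduce_form(3, 5, 4))    # another form of discriminant -23
```

Each pass through the loop either returns or is followed by a swap that strictly lowers the leading coefficient, which mirrors the finiteness argument in the proof.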

Corollary. The number of equivalence classes in $\mathcal{S}_1(\Delta)$ is finite.

Proof. It suffices to show that the number of almost reduced forms is finite. If
$aX^2 + bXY + cY^2$ is almost reduced of discriminant $\Delta$ then

$$a \le c \implies 4a^2 \le 4ac = b^2 - \Delta \le a^2 - \Delta.$$

Hence $3a^2 \le |\Delta|$. Since $a$ is a positive integer, there are only finitely
many possible values of $a$ and hence of $b$. Once $a$ and $b$ are given, $c$ is
uniquely determined.

A natural question to ask is: is there precisely one almost reduced form in each 
equivalence class? The answer is—almost but not quite. 

We know that $aX^2 - bX + c$ has two non-real roots (because $\Delta < 0$), say $\tau$
and $\bar{\tau}$. One of these (say $\tau$) will lie in the upper half plane
$\{x + iy : x, y \in \mathbb{R}, y > 0\}$.
Hence $F(X, Y) = aX^2 + bXY + cY^2 = a(X + \tau Y)(X + \bar{\tau} Y)$ with
$b = a(\tau + \bar{\tau})$ and $c = a\tau\bar{\tau}$. Hence to say that $F$ is almost
reduced is equivalent to saying that $|\tau + \bar{\tau}| \le 1$ and
$\tau\bar{\tau} \ge 1$, i.e., $\tau \in S$ where $S$ is the region shown in Figure 9.1
(including the boundary).



Notice that if $\tau$ is on the left vertical boundary of $S$ then $\tau + 1$ is on the
right vertical boundary which is also in $S$. Similarly, if $\tau$ is on the curve at
$\gamma$ then $-1/\tau$ is at $\gamma'$. In view of this we make the following definition:

Definition. We say that $F(X, Y) = aX^2 + bXY + cY^2$ is reduced if the corresponding
$\tau \in S$ but $\tau \notin$ the left boundary of $S$, i.e., the left vertical boundary
and curve $\gamma$. This is equivalent to saying that $|b| \le a \le c$, and in case
$a = |b|$ then $b \ge 0$, and in case $a = c$ then $b \ge 0$. We now have the expected
theorem.






$\Delta$   possible $a$     possible $b, c$                 reduced forms             $h(\Delta)$
-------    -------------    ----------------------------    ----------------------    -----------
$-3$       $a = 1$          $b = 1, c = 1$                  $X^2 + XY + Y^2$          1
$-4$       $a = 1$          $b = 0, c = 1$                  $X^2 + Y^2$               1
$-7$       $a = 1$          $b = 1, c = 2$                  $X^2 + XY + 2Y^2$         1
$-8$       $a = 1$          $b = 0, c = 2$                  $X^2 + 2Y^2$              1
$-11$      $a = 1$          $b = 1, c = 3$                  $X^2 + XY + 3Y^2$         1
$-12$      $a = 1$ or $2$   $b = 0, c = 3$ if $a = 1$       $X^2 + 3Y^2$              2
                            $b = 2, c = 2$ if $a = 2$       $2(X^2 + XY + Y^2)$
$-15$      $a = 1$ or $2$   $b = 1, c = 4$ if $a = 1$       $X^2 + XY + 4Y^2$         2
                            $b = 1, c = 2$ if $a = 2$       $2X^2 + XY + 2Y^2$
$-16$      $a = 1$ or $2$   $b = 0, c = 4$ if $a = 1$       $X^2 + 4Y^2$              2
                            $b = 0, c = 2$ if $a = 2$       $2(X^2 + Y^2)$
$-19$      $a = 1$ or $2$   $b = 1, c = 5$ if $a = 1$       $X^2 + XY + 5Y^2$         1
$-20$      $a = 1$ or $2$   $b = 0, c = 5$ if $a = 1$       $X^2 + 5Y^2$              2
                            $b = 2, c = 3$ if $a = 2$       $2X^2 + 2XY + 3Y^2$
$-23$      $a = 1$ or $2$   $b = 1, c = 6$ if $a = 1$       $X^2 + XY + 6Y^2$         3
                            $b = 1, c = 3$ if $a = 2$       $2X^2 + XY + 3Y^2$
                            $b = -1, c = 3$ if $a = 2$      $2X^2 - XY + 3Y^2$

Theorem. In each equivalence class of $\mathcal{S}_1(\Delta)$ there is precisely one
reduced form.

The chart gives a list of reduced forms for low values of $|\Delta|$; in it,
$h(\Delta)$ is the number of reduced forms.

We notice that some forms in the list are constant multiples of forms which came 
earlier in the list. 

Definition. A form $aX^2 + bXY + cY^2$ is said to be primitive if $(a, b, c) = 1$,
i.e., if the g.c.d. of its coefficients is 1.

We let $h(\Delta)$ be the number of primitive reduced forms of discriminant $\Delta$ in
$\mathcal{S}_1(\Delta)$. $h(\Delta)$ is known as the class number of the forms with
discriminant $\Delta$. Notice that $h(\Delta)$ is 1 for
$\Delta = -3, -4, -7, -8, -11, -12, -16, -19$ in the list.
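
The chart and the class numbers can be recomputed mechanically from the bound $3a^2 \le |\Delta|$ and the conditions for a reduced form ($|b| \le a \le c$, with $b \ge 0$ whenever $a = |b|$ or $a = c$). The following Python sketch does this; the names `reduced_forms` and `class_number` are illustrative and do not come from the text.

```python
from math import gcd


def reduced_forms(delta):
    """List the reduced forms (a, b, c) of negative discriminant delta."""
    if delta >= 0 or delta % 4 not in (0, 1):
        raise ValueError("delta must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -delta:            # bound from the corollary
        for b in range(-a, a + 1):        # |b| <= a
            num = b * b - delta           # equals 4ac
            if num % (4 * a):
                continue
            c = num // (4 * a)
            if c < a:                     # need a <= c
                continue
            if (a == abs(b) or a == c) and b < 0:
                continue                  # boundary cases require b >= 0
            forms.append((a, b, c))
        a += 1
    return forms


def class_number(delta):
    """Number of primitive reduced forms of discriminant delta."""
    return sum(1 for a, b, c in reduced_forms(delta) if gcd(gcd(a, b), c) == 1)


if __name__ == "__main__":
    for d in (-3, -4, -7, -8, -11, -12, -15, -16, -19, -20, -23):
        print(d, reduced_forms(d), class_number(d))
```

The count of all reduced forms reproduces the last column of the chart, while `class_number` discards the non-primitive forms such as $2(X^2 + XY + Y^2)$.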

Definition. An integer $\Delta \equiv 0$ or $1 \pmod 4$ is said to be a fundamental
discriminant if it is not of the form $\Delta_0 n^2$ where $\Delta_0$ is a discriminant
and $n$ an integer greater than 1.

For instance, $-12$ and $-16$ are not fundamental discriminants. Notice that if $\Delta$
is fundamental, then a form of discriminant $\Delta$ is always primitive. Notice also
that if $\Delta$ is fundamental, then it cannot have an odd square factor. We will see
later that if $\Delta$ is fundamental then it has another interpretation.
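
The definition translates directly into a test. The sketch below takes "discriminant" to mean an integer congruent to 0 or 1 modulo 4, as in the definition above; the function names are illustrative only.

```python
def is_discriminant(d):
    """An integer counts as a discriminant if it is 0 or 1 modulo 4."""
    return d % 4 in (0, 1)


def is_fundamental(delta):
    """True if delta is a non-zero discriminant not of the form delta0 * n^2, n > 1."""
    if delta == 0 or not is_discriminant(delta):
        return False
    n = 2
    while n * n <= abs(delta):
        if delta % (n * n) == 0 and is_discriminant(delta // (n * n)):
            return False
        n += 1
    return True


if __name__ == "__main__":
    print([d for d in range(-30, 0) if is_fundamental(d)])
```

For example, $-12 = (-3) \cdot 2^2$ and $-16 = (-4) \cdot 2^2$ are rejected, while $-8$ and $-20$ pass because $-2$ and $-5$ are not discriminants.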

In 1934, Heilbronn showed that $h(\Delta) \to \infty$ as $\Delta \to -\infty$ from which
it follows (how?) that given any natural number $N$ there are only a finite number of
negative fundamental discriminants $\Delta$ for which the class number $h(\Delta) = N$.
One of the questions that suggests itself from the above is: what are the negative
fundamental $\Delta$ for which $h(\Delta)$ is 1? Above we have given six such $\Delta$'s.
Here are three more: $\Delta = -43, -67, -163$. In 1800, Gauss conjectured that there
were no more.

In 1936, Siegel showed that for every $\varepsilon > 0$ there exists a positive constant
$C_\varepsilon$ such that $h(\Delta) > C_\varepsilon |\Delta|^{\frac{1}{2} - \varepsilon}$.
However, the result showed the existence of $C_\varepsilon$ but not how to compute it.
His proof showed that there cannot be two 'large' values of $|\Delta|$ for which
$h(\Delta)$ is small. From this it was proved that there is possibly just one other
$\Delta$ (call it $\Delta_{10}$) for which $h(\Delta) = 1$ and this $\Delta$ must be very
large indeed. In 1966, Harold Stark, in his thesis, showed that $\Delta_{10}$ does not
exist*. The same methods were applied to the negative $\Delta$ for which
$h(\Delta) = 2$ and it was found that there are 18 such $\Delta$'s, the largest value of
$|\Delta|$ being 427 (Baker, Stark, Montgomery etc). In 1986, using powerful methods in
algebraic geometry, D Goldfeld, B H Gross and D Zagier solved the problem of fundamental
negative $\Delta$ with $h(\Delta) = 3$.

Remark. The $\Delta$ for which $h(\Delta) = 1$ have remarkable properties. For instance,
if $p$ is a positive prime number which is congruent to $3 \pmod 4$ and $h(-p) = 1$ then
$x^2 + x + \frac{p+1}{4}$ is a prime number for all $x$ such that
$0 \le x \le \frac{p-7}{4}$.
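
This remark is easy to test numerically. The sketch below assumes the polynomial is $x^2 + x + (p+1)/4$ and the range is $0 \le x \le (p-7)/4$, which is the classical form of the statement (for $p = 163$ it is Euler's polynomial $x^2 + x + 41$). The helper names `is_prime` and `euler_values` are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def euler_values(p):
    """Values of x^2 + x + (p + 1)/4 for 0 <= x <= (p - 7)/4, where p = 3 mod 4."""
    k = (p + 1) // 4
    return [x * x + x + k for x in range((p - 7) // 4 + 1)]


if __name__ == "__main__":
    for p in (7, 11, 19, 43, 67, 163):      # primes p with h(-p) = 1
        print(p, all(is_prime(v) for v in euler_values(p)))
    print(23, all(is_prime(v) for v in euler_values(23)))   # h(-23) = 3
```

The primes $p = 7, 11, 19, 43, 67, 163$ all pass, whereas $p = 23$, with $h(-23) = 3$, already fails at $x = 0$.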


Suggested Reading 

[1] H M Stark. The complete determination of the complex quadratic fields of class
number one. Michigan Math. J. 14. 1-27, 1967.

[2] J P Serre. A Course in Arithmetic. Narosa Publishing House. New Delhi, 1979.

[3] D Flath. Introduction to Number Theory. John Wiley and Sons. New York,
1989.

* In 1952, an amateur mathematician, Heegner, in Germany, had proved the same result but
his proof had some gaps, which led mathematicians to express reservations about it. Stark
later showed that Heegner's arguments can be made rigorous and managed to make Heegner's
proof work. In fact, Heegner's ideas, in particular his construction of what are now
called Heegner points, have proved to be very fruitful in later work on elliptic curves.

The Class Number Problem

An Introduction to Algebraic Number Theory

Rajat Tandon

This chapter gives an introduction to ‘algebraic number theory’, defines class 
numbers for finite extensions of the field of rational numbers and proves that 
in the context of quadratic fields, this definition coincides with the definition of 
class numbers via binary quadratic forms given in the previous chapter. 

We have seen in the previous chapter that some seemingly innocuous questions starting
with the formula $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ lead to fairly deep mathematics.
This is typical of the subject. It is so important to ask the right question: "ask an
impertinent question and you get a pertinent answer".

The roots of $aX^2 + bX + c = 0$ are given by $\frac{-b \pm \sqrt{\Delta}}{2a}$ where
$\Delta = b^2 - 4ac$, i.e., they are of the form $x + y\sqrt{\Delta}$ with $x$ and $y$
rational. The set $\mathbb{Q}(\sqrt{\Delta})$ of elements of the form
$x + y\sqrt{\Delta}$ with $x$ and $y$ rational forms a subfield of the field of complex
numbers, $\mathbb{C}$. $\mathbb{Q}(\sqrt{\Delta})$ is also a vector space over the
rationals if we define scalar multiplication by
$\lambda(x + y\sqrt{\Delta}) = \lambda x + \lambda y\sqrt{\Delta}$. $\{1, \sqrt{\Delta}\}$
is a basis of $\mathbb{Q}(\sqrt{\Delta})$ over $\mathbb{Q}$, and $\mathbb{Q}$ is a
subfield of $\mathbb{Q}(\sqrt{\Delta})$. This process can easily be generalised. For
instance, let $p$ be a prime and $\zeta = e^{2\pi i/p}$. Let $\mathbb{Q}(\zeta)$ be the
set of complex numbers of the form
$x_0 + x_1\zeta + x_2\zeta^2 + \cdots + x_{p-2}\zeta^{p-2}$ with $x_i$ rational. Note
that $1 + \zeta + \zeta^2 + \cdots + \zeta^{p-1} = 0$ so $\zeta^{p-1}$ can be written in
terms of $1, \zeta, \ldots, \zeta^{p-2}$. Check that $\mathbb{Q}(\zeta)$ is a subfield of
$\mathbb{C}$ containing $\mathbb{Q}$ and that $1, \zeta, \ldots, \zeta^{p-2}$ is a basis
of $\mathbb{Q}(\zeta)$ over $\mathbb{Q}$ with scalar multiplication being defined in the
obvious way. These are examples of fields containing $\mathbb{Q}$ which are finite
dimensional as vector spaces over $\mathbb{Q}$. Such fields are known as algebraic number
fields and were the object of detailed study by Dedekind, Kronecker and Kummer in the
19th century. Amongst the several motivations for studying such fields were three
problems suggested by Greek geometers:

(i) To trisect any given angle. 

(ii) To construct a cube whose volume is twice that of a given cube. 

(iii) To construct a square equal in area to a given circle. 

These constructions were to be done by 'ruler and compass only' in the manner
that we are taught at school. The second problem boils down to being able to
construct by ruler and compass the real root of $X^3 - 2$. Galois and Abel looked at such
problems and their work gave a huge impetus to the systematisation of algebra and
algebraic number theory.

The examples given above, $\mathbb{Q}(\sqrt{\Delta})$ and $\mathbb{Q}(\zeta)$, have been
generated by single elements ($\sqrt{\Delta}$ and $\zeta$) which satisfy some polynomial
with rational (in fact, integral) coefficients ($X^2 - \Delta$, $X^p - 1$ respectively).
Indeed, it can be shown that any subfield of $\mathbb{C}$ containing $\mathbb{Q}$ which
is $n$-dimensional as a vector space over $\mathbb{Q}$ consists of elements of the form
$x_0 + x_1\alpha + x_2\alpha^2 + \cdots + x_{n-1}\alpha^{n-1}$ where the $x_i$ are
rationals and $\alpha$ is a complex number which satisfies a polynomial equation of
degree $n$ with rational coefficients.

The first thing we would want to know about such fields is whether they have a
subring in them in much the same way that $\mathbb{Q}$ contains $\mathbb{Z}$ and every
element of $\mathbb{Q}$ is a ratio of two (one non-zero) elements of $\mathbb{Z}$. One
'natural' possibility in $\mathbb{Q}(\sqrt{\Delta})$ could be
$\mathbb{Z} + \mathbb{Z}\sqrt{\Delta}$, i.e., elements of the form $a + b\sqrt{\Delta}$
with $a$ and $b$ integers or in other words $\mathbb{Z}$-linear combinations of the basis
$1, \sqrt{\Delta}$. Similarly one could consider
$\mathbb{Z} + \mathbb{Z}\zeta + \mathbb{Z}\zeta^2 + \cdots + \mathbb{Z}\zeta^{p-2}$ in
$\mathbb{Q}(\zeta)$. But immediately one would recognise a difficulty in basing a
definition on the choice of a basis. For instance,
$\mathbb{Q}(\sqrt{\Delta}) = \mathbb{Q}(\sqrt{4\Delta})$ but
$\mathbb{Z} + \mathbb{Z}\sqrt{\Delta} \ne \mathbb{Z} + \mathbb{Z}\sqrt{4\Delta}$; or
observe that if $p = 3$ then $\zeta = e^{2\pi i/3} = \frac{-1 + \sqrt{-3}}{2}$ so
$\mathbb{Q}(\zeta) = \mathbb{Q}(\sqrt{-3})$ but
$\mathbb{Z} + \mathbb{Z}\zeta \ne \mathbb{Z} + \mathbb{Z}\sqrt{-3}$. To get around the
problem of square factors of $\Delta$, we will henceforth assume that $\Delta$ is a
fundamental discriminant (see the previous chapter). Hence the only square factor
$\Delta$ can have is 4.

We have already seen that the fields above are generated by elements which satisfy
a monic (leading coefficient 1) polynomial with rational coefficients. In fact, every
element $a + b\sqrt{\Delta}$ in $\mathbb{Q}(\sqrt{\Delta})$ satisfies the polynomial
$X^2 - 2aX + (a^2 - b^2\Delta) = 0$. This suggests an alternative. Why not consider those
elements of $\mathbb{Q}(\sqrt{\Delta})$ (or $\mathbb{Q}(\zeta)$) which satisfy a monic
polynomial with coefficients in $\mathbb{Z}$? Such elements are called algebraic integers
(in the given field). Do such elements form a subring $I$, i.e., are they closed under
addition and multiplication? The answer is 'yes'. Observe that $a + b\sqrt{\Delta}$ will
be an element of the given type provided $2a \in \mathbb{Z}$ and
$a^2 - b^2\Delta \in \mathbb{Z}$. Suppose then that $a + b\sqrt{\Delta}$ and
$c + d\sqrt{\Delta}$ are such that $2a, 2c \in \mathbb{Z}$ and
$a^2 - b^2\Delta,\ c^2 - d^2\Delta \in \mathbb{Z}$. Observe that
$2(a + c) \in \mathbb{Z}$ and
$(a + c)^2 - (b + d)^2\Delta = (a^2 - b^2\Delta) + (c^2 - d^2\Delta) + 2ac - 2bd\Delta$.
We say that a rational number is a half integer if it is of the form $l/2$, where $l$ is
odd. We make the following observations which can easily be proved by the reader: for
$a, b \in \mathbb{Q}$, $2a$ and $a^2 - b^2\Delta$ are integers implies

(i) $2b \in \mathbb{Z}$ since $\Delta$ has no square factor other than possibly 4;

(ii) if $\Delta$ is even, then $a$ must be an integer and $b$ either an integer or half
integer;

(iii) if $\Delta$ is odd, $a$ and $b$ must be either both integers or both half integers.

In all cases it can then be seen that if $2a, 2c \in \mathbb{Z}$ and
$a^2 - b^2\Delta,\ c^2 - d^2\Delta \in \mathbb{Z}$ then $2ac - 2bd\Delta \in \mathbb{Z}$
and therefore that $(a + c)^2 - (b + d)^2\Delta \in \mathbb{Z}$. On the other hand,

$$(a + b\sqrt{\Delta}) \cdot (c + d\sqrt{\Delta}) = ac + bd\Delta + (ad + bc)\sqrt{\Delta}$$

and $2(ac + bd\Delta)$ and

$$(ac + bd\Delta)^2 - (ad + bc)^2\Delta = (a^2 - b^2\Delta) \cdot (c^2 - d^2\Delta)$$

are both in $\mathbb{Z}$. Hence $I$ is indeed closed under addition and multiplication.
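
The integrality criterion ($2a \in \mathbb{Z}$ and $a^2 - b^2\Delta \in \mathbb{Z}$) and the closure claims can be checked on examples with exact rational arithmetic. In this sketch an element $a + b\sqrt{\Delta}$ is stored as the pair `(a, b)`; the function names are illustrative, not notation from the text.

```python
from fractions import Fraction


def is_integral(a, b, delta):
    """Does a + b*sqrt(delta) satisfy a monic quadratic with integer coefficients?"""
    a, b = Fraction(a), Fraction(b)
    trace = 2 * a                      # coefficient -2a of X, up to sign
    norm = a * a - b * b * delta       # constant term a^2 - b^2 * delta
    return trace.denominator == 1 and norm.denominator == 1


def add(u, v):
    """Sum of two elements given as (a, b) pairs."""
    return (u[0] + v[0], u[1] + v[1])


def mul(u, v, delta):
    """Product (a + b*sqrt(delta)) * (c + d*sqrt(delta)) as a pair."""
    return (u[0] * v[0] + u[1] * v[1] * delta, u[0] * v[1] + u[1] * v[0])


if __name__ == "__main__":
    half = Fraction(1, 2)
    u, v = (half, half), (Fraction(3, 2), -half)      # two integers of Q(sqrt(-3))
    print(is_integral(*u, -3), is_integral(*v, -3))
    print(is_integral(*add(u, v), -3), is_integral(*mul(u, v, -3), -3))
    print(is_integral(half, half, -4))                # (1 + sqrt(-4))/2 is not integral
```

The last line illustrates observation (ii): for even $\Delta$ the rational part $a$ must itself be an integer.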

Exercise. Show that

(i) in $\mathbb{Q}(\sqrt{-1})$ we have $I = \{a + b\sqrt{-1} \mid a, b \in \mathbb{Z}\}$;

(ii) in $\mathbb{Q}(\sqrt{-3})$,
$I = \left\{ \frac{a + b\sqrt{-3}}{2} \mid a, b \in \mathbb{Z},\ a \equiv b \pmod 2 \right\} = \mathbb{Z} + \mathbb{Z}\zeta$,
where $\zeta = \frac{-1 + \sqrt{-3}}{2}$ is a cube root of unity.

Would every element of $\mathbb{Q}(\sqrt{\Delta})$ be a ratio of two elements of $I$? We
note that $\mathbb{Z} + \mathbb{Z}\sqrt{\Delta} \subset I$ and
$\frac{a}{b} + \frac{c}{d}\sqrt{\Delta} = \frac{ad + bc\sqrt{\Delta}}{bd}$, so this is
trivially true. What other properties of $\mathbb{Z}$ would we like $I$ to have? The best
would be unique factorisation. In $\mathbb{Z}$ we have the notion of a prime number and
we know that every number can be written, up to sign, uniquely as a product of distinct
prime powers, viz,

$$n = \pm p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$$

where the $p_i$ are distinct primes and, moreover, if $n$ is also equal to
$\pm q_1^{f_1} q_2^{f_2} \cdots q_s^{f_s}$ then after changing the order of the $q_i$'s,
if necessary, we have $r = s$, $p_i = q_i$ and $e_i = f_i$ for all $i$.

Imagine the usefulness of having such a property in $I$. For instance, consider
$\mathbb{Q}(\zeta)$ as above and the ring of integers $I$ in $\mathbb{Q}(\zeta)$, i.e.,
the set of all elements in $\mathbb{Q}(\zeta)$ which satisfy a monic polynomial in
$\mathbb{Z}[X]$, the ring of polynomials in one variable with integer coefficients.
Suppose there exist non-zero integers $x, y, z$ such that $x^p + y^p = z^p$. Then,

$$x^p = z^p - y^p = (z - y)(z - \zeta y)(z - \zeta^2 y) \cdots (z - \zeta^{p-1} y). \qquad (1)$$

It is easy to see that $x \in I$ and $z - \zeta y \in I$. If we have unique factorisation
in $I$, there is just a chance that (1) may give us a contradiction to unique
factorisation (or allow us to use the method of descent) and we may prove Fermat's* last
theorem! It is just possible that Fermat had some such proof in mind when he wrote in
the margin ....

We would first need the notion of a prime element in $I$. This is accomplished more
or less as in $\mathbb{Z}$ (negatives allowed). So we consider $-2, -3, -5, \ldots$ also
as primes.

Definition 1. An integer $n \ne \pm 1$ is a prime if whenever $n$ is written as a product
$ab$ of two integers then either $a$ or $b$ must be $\pm 1$. Note that $\pm 1$ are the
only units in $\mathbb{Z}$, i.e., elements in $\mathbb{Z}$ with a multiplicative inverse.

There is another way of defining a prime number.

Definition 1'. An integer $p \ne \pm 1$ is a prime if and only if whenever $p$ divides a
product of integers $ab$ then $p$ must divide either $a$ or $b$.


* Incidentally, this is what Gauss had to say about FLT. "I confess that Fermat's theorem as an isolated
proposition has very little interest for me because I could easily lay down a multitude of such propositions
which one could neither prove nor dispose of." Gauss said that FLT had induced him to recall some of his
earlier ideas in higher arithmetic but that he was not in a position to go back to that work because of his
circumstances. "Still I am convinced that if I am as lucky as I dare hope and if I succeed in taking some
of the principal steps in that theory, then Fermat's theorem will appear as only one of the least interesting
corollaries."

Recall that if $n$ is an integer then $n\mathbb{Z}$, the set of multiples of $n$, forms
an ideal in $\mathbb{Z}$ (an ideal $J$ in a commutative ring $R$ is an additive subgroup
of $R$ which has the property: $x \in J$, $r \in R$ implies $rx \in J$). If an ideal $J$
in a ring satisfies the property: $ab \in J$ implies either $a \in J$ or $b \in J$, it is
called a prime ideal. So saying that the integer $p$ is a prime number is the same as
saying that $p\mathbb{Z}$ is a prime ideal in $\mathbb{Z}$. It is easy to see that the
two definitions we have given are equivalent in $\mathbb{Z}$.

Based on the above, we could define in an arbitrary commutative ring with unity $R$
(all our rings will be so) an element $\pi$ to be prime either by requiring that whenever
$\pi = ab$, either $a$ or $b$ must be a unit in $R$, or by requiring that the ideal
$\pi R$, consisting of all multiples of $\pi$, is a prime ideal. Unfortunately, in an
arbitrary ring the two definitions are not equivalent. An element $\pi$ which satisfies
the first property is said to be irreducible whereas if $\pi R$ is a prime ideal we call
$\pi$ a prime. In integral domains (commutative rings with no zero divisors) all primes
are irreducible but not vice-versa. (Exercise: Prove this.)

A domain in which every non-zero non-unit can be written as a product of irreducibles
in an essentially unique way, that is, up to order and multiplication by units
($6 = 2 \cdot 3 = 3 \cdot 2 = (-2) \cdot (-3) = (-3) \cdot (-2)$), is called a unique
factorisation domain (UFD). Clearly, $\mathbb{Z}$ is a UFD and it is easy to check that
$I = \mathbb{Z} + \mathbb{Z}i$ is also a UFD.

$\mathbb{Z}$ has another property which is somewhat stronger: every ideal in $\mathbb{Z}$
is of the form $n\mathbb{Z}$ where $n$ is an integer. A domain $D$ which has the property
that every ideal in it is of the form $xD$ for some $x$ in $D$ is called a principal
ideal domain (PID) and every PID is a UFD. If we could show that the ring of integers $I$
in an algebraic number field is always a PID then we could use the argument given above
for FLT. Unfortunately, $I$ is not always a PID. For instance, consider
$\mathbb{Q}(\sqrt{-20})$; then $I = \mathbb{Z} + \mathbb{Z}\sqrt{-5}$ and we have
$6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$. It is easy to check that 2, 3,
$1 \pm \sqrt{-5}$ are all irreducible elements in $I$. We remark that the ring of
integers of an algebraic number field is a UFD if and only if it is a PID.
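
One way to carry out the "easy check" of irreducibility, offered here as a sketch rather than as the author's intended argument, uses the multiplicative norm $N(a + b\sqrt{-5}) = a^2 + 5b^2$. A proper factor of an element of norm 4, 6 or 9 would need norm 2 or 3, and no such element exists. The function names below are illustrative.

```python
from math import isqrt


def norm(a, b):
    """Norm of a + b*sqrt(-5); it is multiplicative and equals 1 only for units."""
    return a * a + 5 * b * b


def mul(u, v):
    """Product of a + b*sqrt(-5) and c + d*sqrt(-5), each given as a pair."""
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)


def has_element_of_norm(n):
    """Search for integers a, b with a^2 + 5b^2 == n."""
    bound = isqrt(n)
    return any(norm(a, b) == n
               for a in range(-bound, bound + 1)
               for b in range(-bound, bound + 1))


if __name__ == "__main__":
    print(mul((2, 0), (3, 0)), mul((1, 1), (1, -1)))        # both equal 6
    print(norm(2, 0), norm(3, 0), norm(1, 1), norm(1, -1))  # 4, 9, 6, 6
    print(has_element_of_norm(2), has_element_of_norm(3))   # no proper factors
```

Since 2 has norm 4 while $1 \pm \sqrt{-5}$ have norm 6, the two factorisations of 6 cannot differ merely by units.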

Recall that if $A$ and $B$ are two ideals in a ring $R$ then we define their product as
$A \cdot B = \left\{ \sum_{i=1}^{n} a_i b_i \mid a_i \in A,\ b_i \in B \text{ for some } n \right\}$.
This is also an ideal. Though $I$ is not always a PID it is true that every ideal in $I$
can be written uniquely, except for order, as a product of prime ideals. This gives us
the first hint that the concept of an ideal may be at least as important as the notion of
an element. Note that in a PID the two notions are almost the same as every ideal is
generated by a single element which is uniquely determined up to units.

So if $I$ is not always a PID then how 'bad' is it? The set $\mathcal{I}$ of non-zero
ideals in $I$ under the product defined above forms a semigroup ($I$ itself is the
identity). We define an equivalence relation on this set $\mathcal{I}$ as follows:
$A \sim B$ if there exist non-zero $\alpha, \beta \in I$ such that
$\alpha I \cdot A = \beta I \cdot B$. It is easy to check that this gives us an
equivalence relation on $\mathcal{I}$ and the product on $\mathcal{I}$ induces a product
on the set of equivalence classes $\mathcal{I}/\!\sim$: $[A] \cdot [B] = [A \cdot B]$.
The crucial point here is to check that the product as defined above is well defined,
i.e., if $A \sim A'$ and $B \sim B'$ then $A \cdot B \sim A' \cdot B'$. The set of
equivalence classes $\mathcal{I}/\!\sim$ with this product is actually a group. It is one
of the fundamental theorems of algebraic number theory that this group is finite, not
just for quadratic extensions of $\mathbb{Q}$ but for any finite extension of
$\mathbb{Q}$. The order of this group is called the class number of the extension. The
class number of $\mathbb{Q}(\sqrt{\Delta})$ will be denoted by $h'(\Delta)$. Note that
the class number is one if and only if $I$ is a PID.

Now let $\Delta$ be a negative fundamental discriminant, i.e., a negative integer
$\Delta$ which is congruent to 0 or 1 modulo 4 and which cannot be written in the form
$\Delta_0 n^2$ where $\Delta_0$ is another discriminant and $n$ is an integer greater
than 1. Hence 4 is the only possible square factor of $\Delta$. Recall that we have
defined $h(\Delta)$ to be the number of equivalence classes of primitive binary integral
quadratic forms. Remarkably:

Theorem. $h(\Delta) = h'(\Delta)$.

In order to prove this we first observe that if $\alpha = a + b\sqrt{\Delta}$ is in the
ring of integers $I$ of $\mathbb{Q}(\sqrt{\Delta})$ then so also is
$\bar{\alpha} = a - b\sqrt{\Delta}$. Hence so also is $\alpha\bar{\alpha}$, which is an
integer. Hence if $A$ is any non-zero ideal of $I$ then $A \cap \mathbb{Z} \ne (0)$.
Clearly $A \cap \mathbb{Z}$ is an ideal in $\mathbb{Z}$ so
$A \cap \mathbb{Z} = a\mathbb{Z}$ for some integer $a > 0$. Observe also that any
non-zero ideal of $I$ cannot be contained in $\mathbb{Z}$.

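In order to make life a bit easier, we will assume in what follows that $\Delta$ is odd
and hence that $I = \mathbb{Z} + \mathbb{Z}[(1 + \sqrt{\Delta})/2]$ (proof?). Let $A$ be
a non-zero ideal in $I$. Define

$$J = \left\{ r \in \mathbb{Z} \;\middle|\; r\,\frac{1 + \sqrt{\Delta}}{2} + s \in A \text{ for some } s \in \mathbb{Z} \right\}.$$

Then $J$ is an ideal in $\mathbb{Z}$ and since $A \not\subset \mathbb{Z}$, $J$ is
non-zero. Let $J = t\mathbb{Z}$, $t > 0$. Then there exists an $s \in \mathbb{Z}$ such
that $t[(1 + \sqrt{\Delta})/2] + s \in A$. We claim that
$A = a\mathbb{Z} + [(t + 2s + t\sqrt{\Delta})/2]\mathbb{Z}$. Clearly, the right-hand side
is contained in $A$. Let $\alpha = u + v[(1 + \sqrt{\Delta})/2] \in A$. Then $v \in J$ so
$v = tv'$ for some $v' \in \mathbb{Z}$. Therefore

$$\alpha - v'\,\frac{(t + 2s) + t\sqrt{\Delta}}{2} = u + tv'\,\frac{1 + \sqrt{\Delta}}{2} - v'\,\frac{(t + 2s) + t\sqrt{\Delta}}{2} = u - sv' \in A \cap \mathbb{Z} = a\mathbb{Z}.$$

Therefore, $\alpha \in a\mathbb{Z} + [(t + 2s + t\sqrt{\Delta})/2]\mathbb{Z}$. Hence,
every ideal $A$ in $I$ is of the form
$a\mathbb{Z} + [(b + c\sqrt{\Delta})/2]\mathbb{Z}$, $a > 0$, $c > 0$. For this to be an
ideal, it must be closed under multiplication by $(1 + \sqrt{\Delta})/2$. Hence
$a[(1 + \sqrt{\Delta})/2] \in a\mathbb{Z} + [(b + c\sqrt{\Delta})/2]\mathbb{Z}$, i.e.,
there exist integers $m, n$ such that
$a[(1 + \sqrt{\Delta})/2] = ma + n[(b + c\sqrt{\Delta})/2]$, which implies $a = nc$ and
$1 = 2m + \frac{b}{c}$, i.e., $c$ divides $a$, $c$ divides $b$ and $\frac{b}{c}$ is odd.
Let $a = tc$, $b = uc$, $u$ odd (here $t$ is redefined).

Then
$a\mathbb{Z} + [(b + c\sqrt{\Delta})/2]\mathbb{Z} = tc\mathbb{Z} + [(uc + c\sqrt{\Delta})/2]\mathbb{Z} = c\big[t\mathbb{Z} + [(u + \sqrt{\Delta})/2]\mathbb{Z}\big]$.
Hence, every ideal in $I$ is of the form
$c\big[t\mathbb{Z} + [(u + \sqrt{\Delta})/2]\mathbb{Z}\big]$, with $c > 0$, $t > 0$ and
$u$ odd. Further, since $A$ is closed under multiplication by $(1 + \sqrt{\Delta})/2$,
$c[(u + \sqrt{\Delta})/2][(1 + \sqrt{\Delta})/2] \in A$. Hence there exist integers
$h, k$ such that

$$\frac{(u + \Delta) + (1 + u)\sqrt{\Delta}}{4} = ht + k\,\frac{u + \sqrt{\Delta}}{2}.$$

Therefore, $k = \frac{1 + u}{2}$ and
$\frac{u + \Delta}{4} = ht + \frac{ku}{2} = ht + \frac{u(1 + u)}{4}$.
Hence $\Delta = u^2 + 4ht$. We have proved:

Proposition. Every ideal in $I$ is of the form
$t\big[a\mathbb{Z} + \{(b + \sqrt{\Delta})/2\}\mathbb{Z}\big]$ for some integers
$a, b, t$ with $t > 0$, $a > 0$ and such that there exists an integer $c$ with
$\Delta = b^2 - 4ac$.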


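Proof of the Theorem. We denote by $[aX^2 + bXY + cY^2]$ the equivalence class
of the form $aX^2 + bXY + cY^2$ in $\mathcal{S}_1(\Delta)$. We denote by $[A]$ the
equivalence class of the ideal $A$ in $\mathcal{I}$. Define

$$\varepsilon : \mathcal{S}_1(\Delta)/\!\sim \;\longrightarrow\; \mathcal{I}/\!\sim, \qquad [aX^2 + bXY + cY^2] \longmapsto \left[ a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z} \right].$$

Then the proposition we have proved above shows that $\varepsilon$ is surjective. We
need, of course, to show that $\varepsilon$ is well defined. For this we must show that if

$$A \cdot (aX^2 + bXY + cY^2) = a'X^2 + b'XY + c'Y^2$$

where $A$ is either $T$ or $W$, then

$$a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z} \;\sim\; a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z}.$$

If $A = T$ then $a' = a$ and $b' = b + 2a$ which implies that
$a\mathbb{Z} + [(b + \sqrt{\Delta})/2]\mathbb{Z} = a'\mathbb{Z} + [(b' + \sqrt{\Delta})/2]\mathbb{Z}$.

If $A = W$ then $a' = c$ and $b' = -b$ so

$$a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z} = c\mathbb{Z} + \frac{-b + \sqrt{\Delta}}{2}\,\mathbb{Z} = \frac{b^2 - \Delta}{4a}\,\mathbb{Z} + \frac{-b + \sqrt{\Delta}}{2}\,\mathbb{Z}.$$

Therefore,

$$a\left( a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z} \right) = \frac{-b + \sqrt{\Delta}}{2}\left( a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z} \right)$$

and we have proved what was required.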

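In order to prove our theorem we must show that $\varepsilon$ is a bijection. Only the
injectivity of $\varepsilon$ is left. Before proving injectivity we make two remarks:

(a) If $A$ and $B$ are two ideals in $I$ then they are equivalent if there exist
$\alpha, \beta \in I$ such that $\alpha A = \beta B$. But this is equivalent to
$\alpha\bar{\alpha} A = \bar{\alpha}\beta B$ and $\alpha\bar{\alpha}$ is a positive
integer. Hence $A \sim B$ if and only if there exists an integer $t > 0$ and
$\beta \in I$ such that $t \cdot A = \beta \cdot B$.

(b) If $\alpha, \beta, \gamma, \delta \in I$ and
$\alpha\mathbb{Z} + \beta\mathbb{Z} = \gamma\mathbb{Z} + \delta\mathbb{Z}$ then there
exists an integral $2 \times 2$ matrix $A$ of determinant $\pm 1$ such that
$A \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} \gamma \\ \delta \end{pmatrix}$.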

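Now suppose that

$$\varepsilon([aX^2 + bXY + cY^2]) = \varepsilon([a'X^2 + b'XY + c'Y^2]),$$

i.e.,

$$a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z} \;\sim\; a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z}.$$

Hence there exists an integer $t' > 0$ and an $\alpha = \frac{p + q\sqrt{\Delta}}{2}$ in
$I$ such that

$$\alpha \cdot \left( a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z} \right) = t' \cdot \left( a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z} \right) = \mathfrak{A} \text{ (say)}.$$

We must show that

$$a'X^2 + b'XY + c'Y^2 = A \cdot (aX^2 + bXY + cY^2)$$

for some $A$ in $SL(2, \mathbb{Z})$.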

Case 1: Let $q = 0$ and $t = p/2$. Then
$at\mathbb{Z} = a't'\mathbb{Z} = \mathfrak{A} \cap \mathbb{Z}$. We may without loss of
generality assume that $at = a't'$ and hence $t > 0$. There exist integers $m, n$ such
that $t[(b + \sqrt{\Delta})/2] = ma't' + nt'[(b' + \sqrt{\Delta})/2]$ which implies that
$t = nt'$ and hence $a' = na$. There also exist integers $k, l$ such that
$t'[(b' + \sqrt{\Delta})/2] = kta + lt[(b + \sqrt{\Delta})/2]$. Hence $ln = 1$, so
$n = 1$, $t = t'$, $a = a'$ and $b' = b + 2ak$. It is now easy to see that

$$a'X^2 + b'XY + c'Y^2 = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \cdot (aX^2 + bXY + cY^2).$$


Case 2: ($q \ne 0$). In view of case 1 we may assume that $(p, q) = 1$. By the
proposition above and remark (b) there exists an integral matrix
$A = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$ of determinant $\pm 1$ such that

$$A \cdot \begin{pmatrix} a\,\dfrac{p + q\sqrt{\Delta}}{2} \\[2ex] \dfrac{b + \sqrt{\Delta}}{2} \cdot \dfrac{p + q\sqrt{\Delta}}{2} \end{pmatrix} = \begin{pmatrix} t'a' \\[1ex] t'\,\dfrac{b' + \sqrt{\Delta}}{2} \end{pmatrix},$$

or, in fact, by multiplying by the matrix
$\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$ if necessary, we can assume that $A$ is in
$SL(2, \mathbb{Z})$ and

$$A \cdot \begin{pmatrix} a\,\dfrac{p + q\sqrt{\Delta}}{2} \\[2ex] \dfrac{b + \sqrt{\Delta}}{2} \cdot \dfrac{p + q\sqrt{\Delta}}{2} \end{pmatrix} = \begin{pmatrix} \pm t'a' \\[1ex] t'\,\dfrac{b' + \sqrt{\Delta}}{2} \end{pmatrix}. \qquad (2)$$

Therefore,
$xa[(p + q\sqrt{\Delta})/2] + y[\{(bp + q\Delta) + (p + bq)\sqrt{\Delta}\}/4] = \pm t'a'$
which implies that $xa(p/2) + y[(bp + q\Delta)/4] = \pm t'a'$ and
$xa(q/2) + y[(p + bq)/4] = 0$. Hence, $2xaq = -y(p + bq)$. Let $e$ be the positive g.c.d.
of $2a$ and $p + bq$. Then $x[(2aq)/e] = -y[(p + bq)/e]$, so $\frac{2aq}{e}$ divides $y$
and $y = \frac{2aq}{e} \cdot r$ for some integer $r$. Then $x = -r[(p + bq)/e]$. Since
$(x, y) = 1$ we get $r = \pm 1$. A simple calculation now shows that
$xa(p/2) + y[(bp + q\Delta)/4] = \pm t'a' = -\frac{2ar}{e}\,\alpha\bar{\alpha}$. Hence,
keeping in view the various signs, we get $t'a' = (2a/e)\alpha\bar{\alpha}$. Furthermore,
since $xw - yz = 1$, substituting the values of $x$ and $y$ given above, we get
$w(p + bq) + 2aqz = -re$. We further get from (2) that
$za[(p + q\sqrt{\Delta})/2] + w[\{(b + \sqrt{\Delta})/2\}\{(p + q\sqrt{\Delta})/2\}] = t'[(b' + \sqrt{\Delta})/2]$
which implies that $2zap + w(bp + q\Delta) = 2t'b'$ and $2zaq + w(p + bq) = 2t'$, i.e.,
$-re = 2t'$. Hence $2zap + w(bp + q\Delta) = -reb'$. It is now easy to check that
$A^t \cdot (aX^2 + bXY + cY^2) = a'X^2 + b'XY + c'Y^2$. For instance, the coefficient of
$X^2$, if we replace $X$ by $xX + zY$ and $Y$ by $yX + wY$ in the expression
$aX^2 + bXY + cY^2$, is $ax^2 + bxy + cy^2$. Substituting $x = -r[(p + bq)/e]$ and
$y = \frac{2aq}{e} \cdot r$ and using the fact that $t'a' = (2a/e)\alpha\bar{\alpha}$ and
$2t' = -re$, we get $ax^2 + bxy + cy^2 = a'$. Similarly, the coefficient of $XY$ on the
required transformation is $2axz + bxw + byz + 2cyw$ which on substitution is just $b'$.
Therefore $A^t \cdot (aX^2 + bXY + cY^2) = a'X^2 + b'XY + c'Y^2$ and $\varepsilon$ is
injective.

This is a beautiful example in mathematics where two apparently unrelated objects 
turn out to be equal. Maybe the reader can discover some more. 
















Suggested Reading 

[1] D Flath. Introduction to Number Theory. John Wiley and Sons. New York,
1989.

[2] J P Serre. A Course in Arithmetic. Narosa Publishing House. New Delhi, 1979.

[3] H M Stark. The complete determination of the complex quadratic fields of class
number one. Michigan Math. J. 14. 1-27, 1967.

[4] Algebraic Number Theory. Mathematical Pamphlets 4. Tata Institute of Fundamental
Research. Mumbai, 1964.


Rajat Tandon 
Department of Mathematics 
University of Hyderabad 
Central University P.O. 
Hyderabad 500 046 


11 


Roots are Not Contained in Cyclotomic Fields


Rajat Tandon

The square root of any integer is contained in a cyclotomic field, i.e., an extension
field $\mathbb{Q}(\zeta_n)$ of $\mathbb{Q}$ generated by $\zeta_n = e^{2\pi i/n}$. There
is a famous theorem of Kronecker and Weber (see the remarks at the end) which vastly
generalises this fact. In what follows, if $\alpha_1, \alpha_2, \ldots, \alpha_n$ are
complex numbers, we denote by $\mathbb{Q}(\alpha_1, \alpha_2, \ldots, \alpha_n)$ the
smallest subfield of $\mathbb{C}$ containing the $\alpha_i$'s. As in the case of Fermat's
last theorem (FLT, where $x^n + y^n = z^n$ has integer solutions only in the case
$n = 2$), the surprising fact is that other $n$th roots (other than square roots) are
never contained in a cyclotomic extension. Of course, one must exercise a little care.
For instance $\sqrt[4]{4} = \sqrt{2}$ is a square root and hence contained in a
cyclotomic extension. The point here is that $\sqrt[4]{4}$ is not a genuine fourth root;
it is, in fact, a square root.

Definition 1. If $a$ is an integer greater than 1 then the real number $\sqrt[n]{a}$ is
said to be a genuine $n$th root if it cannot be written in the form $\sqrt[m]{b}$ for
some integer $b$ and some $m < n$.

In particular, a genuine $n$th root for $n > 1$ is irrational; for if it is rational,
then it is of the form $\sqrt[m]{b}$ for some integer $b$ with $m = 1$. We have the
following theorem:

Theorem 2. Let $a$ be any integer. Then, $\sqrt{a}$ is contained in a cyclotomic field.
If $\sqrt[n]{a}$ is a genuine $n$th root where $a$ is an integer greater than 1 and $n$
an integer greater than 2, then $\sqrt[n]{a}$ is not contained in any cyclotomic field.

The first assertion is very well-known and is easy to establish. While proving it, one
actually proves a stronger statement viz.,

Proposition 3. If $p$ is a prime, then $\sqrt{p} \in \mathbb{Q}(\zeta_{4p})$.
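
The text does not spell out a proof here. One standard route, assumed in this sketch, goes through the quadratic Gauss sum $g = \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \zeta_p^k$ for an odd prime $p$, which lies in $\mathbb{Q}(\zeta_p)$ and satisfies $g^2 = (-1)^{(p-1)/2} p$. Adjoining $i = \zeta_4$ then yields a square root of $p$ itself. The Python below checks the identity numerically; the function names are illustrative.

```python
import cmath


def legendre(k, p):
    """Legendre symbol (k/p) for an odd prime p, via Euler's criterion."""
    r = pow(k, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def gauss_sum(p):
    """Quadratic Gauss sum: sum of (k/p) * zeta_p^k for k = 1, ..., p - 1."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(k, p) * zeta ** k for k in range(1, p))


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        g = gauss_sum(p)
        print(p, g * g)    # close to +p when p = 1 mod 4, to -p when p = 3 mod 4
```

This is a floating-point check only, so the printed squares agree with $\pm p$ up to rounding error.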

Observe that if $\sqrt[n]{a}$ is genuine and $a = p_1^{c_1} p_2^{c_2} \cdots p_r^{c_r}$
is the factorisation of $a$ into distinct prime powers, then
g.c.d.$(c_1, c_2, \ldots, c_r, n) = 1$. For, if $d = (c_1, c_2, \ldots, c_r, n)$ and
$b = p_1^{c_1/d} p_2^{c_2/d} \cdots p_r^{c_r/d}$ then $\sqrt[n]{a} = \sqrt[n/d]{b}$.
Thus, if $n$ has an odd prime factor $p$, in order to show that $\sqrt[n]{a}$ is not
contained in any cyclotomic extension it suffices to show that
$(\sqrt[n]{a})^{n/p} = \sqrt[p]{a}$ is not contained in a cyclotomic extension. On the
other hand, if $n = 2^r$, $r \ge 2$, it suffices to show that
$(\sqrt[n]{a})^{n/4} = \sqrt[4]{a}$ is not contained in any cyclotomic extension, i.e.,
it suffices (as in the case of FLT) to prove our theorem for $n = 4$ or $p$ where $p$ is
any odd prime.

The proof follows from the following propositions which can be found in any 
standard text on Galois theory (see, for instance [1]). We will also refer to the article 
[2] on Galois theory by B Sury which appeared in Resonance. In what follows, K 
and F will always denote subfields of C, and if K is any such field we denote by 
G{K) the group of automorphisms of K. [K : F] denotes the dimension of K as a 
vector space over F. 

Proposition 4. If $F \subset K \subset L$ then $[L : F] = [L : K][K : F]$.

It is easy to see that if $\alpha_i$'s form a basis of $K$ over $F$ and $\beta_j$'s form
a basis of $L$ over $K$, then the $\alpha_i\beta_j$'s form a basis of $L$ over $F$.

Proposition 5. $[F(\alpha) : F]$ is equal to the degree of the unique monic polynomial
$f_\alpha$ of minimal degree in $F[X]$ satisfied by $\alpha$, and this is the same as the
degree of any irreducible polynomial in $F[X]$ satisfied by $\alpha$.

(See lemma in [2].) It can easily be seen by using the Euclidean algorithm for
polynomials that $f_\alpha$ divides any polynomial in $F[X]$ that has $\alpha$ as a root
and hence divides any irreducible polynomial $g$ satisfied by $\alpha$. Irreducibility of
$g$ implies that $g = cf_\alpha$ for some constant $c$ in $F$.

Proposition 6. The group $G(\mathbb{Q}(\zeta_m))$ of automorphisms of the field
$\mathbb{Q}(\zeta_m)$ for any $m > 2$ is abelian; in fact, it is isomorphic to the group
of units in the ring $\mathbb{Z}/m\mathbb{Z}$.

It is clear that if $\sigma$ is an automorphism of $\mathbb{Q}(\zeta_m)$ then since
$\zeta_m^m = 1$ we get $\sigma(\zeta_m)^m = 1$ so $\sigma(\zeta_m)$ is another $m$th root
of 1. Since an automorphism of a group preserves order and $\sigma$ is an automorphism of
the multiplicative group $(\mathbb{Q}(\zeta_m) - (0))$, $\sigma(\zeta_m)$ has order $m$
so $\sigma(\zeta_m) = \zeta_m^i$ for some $i$ coprime to $m$. We thus have a map
$\sigma \mapsto i$ from $G(\mathbb{Q}(\zeta_m))$ to the group of units in
$\mathbb{Z}/m\mathbb{Z}$. That the map is a homomorphism is a simple exercise. It is
clear that $\sigma$ is completely determined by its action on $\zeta_m$ since $\zeta_m$
generates $\mathbb{Q}(\zeta_m)$. Hence the map is injective. That the map is surjective
follows from Proposition 8.

Proposition 7. If $\mathbb{Q} \subset F \subset K$ where $F$ and $K$ are each generated
over $\mathbb{Q}$ by the roots of some polynomials in $\mathbb{Q}[X]$, i.e., $F$ and $K$
are splitting fields of polynomials in $\mathbb{Q}[X]$ (see [2]), then $G(F)$ is
isomorphic to $G(K)/G(K/F)$ where $G(K/F)$ denotes the subgroup of $G(K)$ consisting of
those automorphisms of $K$ which fix the elements of $F$. Hence if $G(K)$ is abelian, so
is $G(F)$.

This follows easily if we consider the restriction map from $G(K)$ to $G(F)$. The fact
that if $\sigma \in G(K)$, then $\sigma(F) = F$ follows from the fact that $F$ is normal
over $\mathbb{Q}$ (refer [2], Box 14). For, suppose
$F = \mathbb{Q}(\alpha_1, \alpha_2, \ldots, \alpha_n)$, where the $\alpha_i$'s are roots
of some polynomial $f(x)$ in $\mathbb{Q}[X]$. Since $f(\alpha_i) = 0$ we have
$\sigma(f(\alpha_i)) = 0$. But $\sigma(f(\alpha_i)) = f(\sigma(\alpha_i))$, so
$\sigma(\alpha_i)$ must be another root of $f$, i.e., $\sigma(\alpha_i) = \alpha_j$ for
some $j$; $\sigma$ permutes the roots of $f$ and so $\sigma(F) = F$.

Proposition 8. If $K$ is generated over $F$ by the roots of some polynomial in $F[X]$
and $\alpha, \alpha'$ are two roots in $K$ of an irreducible polynomial $f$ in $F[X]$,
then there exists an automorphism $\sigma$ in $G(K/F)$ such that
$\sigma(\alpha) = \alpha'$.

We have an isomorphism (just the substitution map) from $F[X]/(f)$ to $F(\alpha)$ which
maps $X + (f)$ to $\alpha$ and similarly an isomorphism from $F[X]/(f)$ to $F(\alpha')$
which maps $X + (f)$ to $\alpha'$. Hence we have an isomorphism from $F(\alpha)$ to
$F(\alpha')$ which maps $\alpha$ to $\alpha'$. This map extends to an automorphism of $K$
(see Proposition 5.2 in [1]).


Proposition 9. If $p$ is an odd prime or 4 and if $\sqrt[p]{a}$ is genuine with $a > 1$ then
$G(\mathbb{Q}(\sqrt[p]{a}, \zeta_p))$ is not abelian.


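If $p$ is an odd prime, $\mathbb{Q}(\zeta_p)$ is the field generated over $\mathbb{Q}$ by the roots of the polynomial
$1 + X + X^2 + \cdots + X^{p-1}$. If $p$ is an odd prime or 4 and $a > 1$, then $\mathbb{Q}(\sqrt[p]{a}, \zeta_p)$ is the
field generated over $\mathbb{Q}$ by the roots of $X^p - a$. Both these polynomials are irreducible
over $\mathbb{Q}$. Hence

\[ [\mathbb{Q}(\zeta_p) : \mathbb{Q}] = \begin{cases} p - 1 & \text{if } p \text{ is odd} \\ 2 & \text{if } p = 4 \end{cases} \]

and $[\mathbb{Q}(\sqrt[p]{a}) : \mathbb{Q}] = p$. Hence by Proposition 4

\[ [\mathbb{Q}(\sqrt[p]{a}, \zeta_p) : \mathbb{Q}] = \begin{cases} p(p - 1) & \text{if } p \text{ is odd} \\ 8 & \text{if } p = 4. \end{cases} \]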


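It follows again by Proposition 4 that $[\mathbb{Q}(\sqrt[p]{a}, \zeta_p) : \mathbb{Q}(\zeta_p)] = p$ and
$[\mathbb{Q}(\sqrt[p]{a}, \zeta_p) : \mathbb{Q}(\sqrt[p]{a})] = p - 1$ or 2 according as $p$ is odd or 4, respectively. Hence by
Proposition 5, $X^p - a$ is irreducible over $\mathbb{Q}(\zeta_p)$ and $1 + X + X^2 + \cdots + X^{p-1}$ is irreducible
over $\mathbb{Q}(\sqrt[p]{a})$ if $p$ is odd, whilst $X^2 + 1$ is irreducible over $\mathbb{Q}(\sqrt[4]{a})$. Observe that $\sqrt[p]{a}$ and
$\sqrt[p]{a}\,\zeta_p$ are roots of $X^p - a$, and $\zeta_p$ and $\zeta_p^2$ are roots of $1 + X + \cdots + X^{p-1}$ if $p$ is odd,
whereas $\zeta_p$ and $\zeta_p^3$ are roots of $X^2 + 1$ if $p = 4$. Hence by Proposition 8 there exists
an automorphism $\sigma \in G(\mathbb{Q}(\sqrt[p]{a}, \zeta_p)/\mathbb{Q}(\zeta_p))$ (i.e. $\sigma$ fixes $\zeta_p$) such that $\sigma(\sqrt[p]{a}) = \sqrt[p]{a}\,\zeta_p$,
and there exists $\tau \in G(\mathbb{Q}(\sqrt[p]{a}, \zeta_p)/\mathbb{Q}(\sqrt[p]{a}))$ (i.e. $\tau$ fixes $\sqrt[p]{a}$) such that $\tau(\zeta_p) = \zeta_p^2$ if
$p$ is odd and $\tau(\zeta_p) = \zeta_p^3$ if $p = 4$. Hence,

\[ \sigma\tau(\sqrt[p]{a}) = \sigma(\sqrt[p]{a}) = \sqrt[p]{a}\,\zeta_p \]

whereas

\[ \tau\sigma(\sqrt[p]{a}) = \tau(\sqrt[p]{a}\,\zeta_p) = \tau(\sqrt[p]{a})\,\tau(\zeta_p) = \begin{cases} \sqrt[p]{a}\,\zeta_p^2 & \text{if } p \text{ is odd} \\ \sqrt[p]{a}\,\zeta_p^3 & \text{if } p = 4. \end{cases} \]

In either case $\sigma\tau \neq \tau\sigma$ and $G(\mathbb{Q}(\sqrt[p]{a}, \zeta_p))$ is not abelian.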

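Observe that if $\mathbb{Q}(\sqrt[n]{a}) \subset \mathbb{Q}(\zeta_m)$, then $\mathbb{Q}(\sqrt[n]{a}, \zeta_n) \subset \mathbb{Q}(\zeta_m, \zeta_n) = \mathbb{Q}(\zeta_{[m,n]})$ where
$[m, n]$ is the l.c.m. of $m$ and $n$. If $\mathbb{Q}(\sqrt[p]{a}, \zeta_p)$ were contained in the cyclotomic extension
$\mathbb{Q}(\zeta_m)$, its group of automorphisms would, by Proposition 7, be a quotient of the
abelian group $G(\mathbb{Q}(\zeta_m))$, and hence abelian.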






Remarks 

We have apparently proved the stronger result that for a genuine $p$th root $\sqrt[p]{a}$ (with $p$
an odd prime and $a > 1$), the Galois extension field generated by it is not an abelian
extension of $\mathbb{Q}$. However, this is not really a stronger statement. The deep theorem of
Kronecker and Weber referred to in the introduction says that any abelian extension
of $\mathbb{Q}$ is contained in a cyclotomic extension. The interesting question is whether one
can similarly obtain the abelian extensions of any algebraic number field by adjoining
special values of transcendental functions. For imaginary quadratic fields
this has been solved using the so-called theory of complex multiplication. Roughly,
the role of the exponential function is taken by the elliptic modular $j$-function and the
values are considered at points of finite order on the elliptic curves (in place of the
circle as was the case in the Kronecker-Weber theorem). The general question is
known as Kronecker's 'Jugendtraum' (the German word means 'dream of youth') and
is still open. It is one of the famous 'Hilbert problems' (the 12th problem). Hilbert
writes in his 1900 address at the International Congress of Mathematicians that the
extension of Kronecker's theorem to any algebraic number field seems to him to be
of the greatest importance and that he regards this as one of the most profound and
far-reaching problems in the theory of numbers.
Die ganzen Zahlen hat Gott gemacht

Polynomials with Integer Values 

B Sury 

A quote attributed to the famous mathematician L Kronecker is 'Die ganzen Zahlen
hat Gott gemacht, alles andere ist Menschenwerk.' A translation might be 'God gave
us integers and all else is man's work.' All of us are familiar already from middle
school with the similarities between the set of integers and the set of all polynomials
in one variable. A paradigm of this is the Euclidean (division) algorithm. However,
it requires an astute observer to notice that one has to deal with polynomials with
real or rational coefficients rather than just integer coefficients for a strict analogy.
There are also some apparent dissimilarities; for instance, there is no notion among
integers corresponding to the derivative of a polynomial. In this discussion, we shall
consider polynomials with integer coefficients. Of course a complete study of this
encompasses the whole subject of algebraic number theory, one might say. For
most of this paper (in fact, with the exception of Lemma 5, Lemma 7 and Exercise 3),
we adhere to fairly elementary methods and address a number of rather natural questions.
To give a prelude, one such question might be "if an integral polynomial takes
only values which are perfect squares, then must it be the square of a polynomial?"

Note that the polynomial $\binom{X}{n} = \frac{X(X-1)\cdots(X-n+1)}{n!}$ takes
integer values at all integers although it does not have integer coefficients. By $\mathbb{Z}$, we
shall denote the set of integers.


Prime Values and Irreducibility 

The first observation about polynomials taking integral values is


Lemma 1. A polynomial $P$ takes $\mathbb{Z}$ to $\mathbb{Z}$ if, and only if, $P(X) = a_0 + a_1\binom{X}{1}
+ \cdots + a_n\binom{X}{n}$ with $a_i \in \mathbb{Z}$.

Proof. The sufficiency is evident. For the converse, we first note that any polynomial
whatsoever can be written in this form for some $n$ and some (possibly nonintegral)
$a_i$'s. Writing $P$ in this form and assuming that $P(\mathbb{Z}) \subset \mathbb{Z}$, we have

\[ P(0) = a_0 \in \mathbb{Z}, \]
\[ P(1) = a_0 + a_1 \in \mathbb{Z}, \]

and so on. Inductively, since $P(m) \in \mathbb{Z}$ for all $m$, we get $a_i \in \mathbb{Z}$ for all $i$.
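
The inductive step in the proof of Lemma 1 amounts to taking forward differences of the values P(0), P(1), ..., P(n). The following is a minimal sketch of that computation; the function names `binomial_basis` and `evaluate` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from math import comb


def binomial_basis(values):
    """Given P(0), ..., P(n) for a polynomial P of degree <= n, return
    [a_0, ..., a_n] with P(X) = sum of a_i * C(X, i).

    The coefficient a_i is the i-th forward difference of P at 0.
    """
    row = [Fraction(v) for v in values]
    coeffs = []
    while row:
        coeffs.append(row[0])
        row = [row[k + 1] - row[k] for k in range(len(row) - 1)]
    return coeffs


def evaluate(coeffs, m):
    """Evaluate sum of a_i * C(m, i) at an integer m >= 0."""
    return sum(a * comb(m, i) for i, a in enumerate(coeffs))


# P(X) = X(X+1)/2 has non-integer coefficients but integer values.
values = [Fraction(m * (m + 1), 2) for m in range(3)]
coeffs = binomial_basis(values)
```

Here `coeffs` comes out as integers, in line with the lemma: X(X+1)/2 equals C(X,1) + C(X,2).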

Corollary 1. If a polynomial $P$ takes $\mathbb{Z}$ to $\mathbb{Z}$ and has degree $n$, then $n!\,P(X) \in
\mathbb{Z}[X]$.

Lemma 2. A nonconstant integral polynomial P{X) cannot take only prime values. 

Proof. If all values are composite, then there is nothing to prove. So assume that
$P(a) = p$ for some integer $a$ and prime $p$. Now, as $P$ is nonconstant,

\[ \lim_{n \to \infty} |P(a + np)| = \infty. \]

So, for big enough $n$, $|P(a + np)| > p$. But $P(a + np) \equiv P(a) \equiv 0 \bmod p$, which
shows $P(a + np)$ is composite.
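
As an illustration of this argument, the sketch below searches along the progression a + np for the first value exceeding p in absolute value. The polynomial X^2 + X + 41 is chosen here only as a familiar example, and the helper names are hypothetical.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small inputs."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def composite_value(P, a, limit=1000):
    """Assuming |P(a)| = p is prime, return (a + n*p, |P(a + n*p)|) for the
    first n >= 1 with |P(a + n*p)| > p. That value is a proper multiple of p."""
    p = abs(P(a))
    if not is_prime(p):
        raise ValueError("P(a) must be prime up to sign")
    for n in range(1, limit):
        v = abs(P(a + n * p))
        if v > p:
            return a + n * p, v
    return None


def euler(x):
    return x * x + x + 41
```

For instance, `composite_value(euler, 0)` starts from P(0) = 41 and lands on P(41) = 1763 = 41 · 43.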

Remark 1. Infinitely many primes can occur as integral values of a polynomial.
For example, if $(a, b) = 1$, then the well-known (but deep) Dirichlet's theorem on
primes in progression shows that the polynomial $aX + b$ takes infinitely many prime
values. In general, it may be very difficult to decide whether a given polynomial
takes infinitely many prime values. For instance, it is not known if $X^2 + 1$ represents
infinitely many primes. In fact, there is no polynomial of degree $\ge 2$ which is known
to take infinitely many prime values.

Lemma 3. If $P$ is a nonconstant, integral-valued polynomial, then the number of
prime divisors of its values $\{P(m)\}$ is infinite, i.e., not all terms of the sequence
$P(0), P(1), \ldots$ can be built from finitely many primes.

Proof. It is clear from Corollary 1 above that it is enough to prove this for $P(X) \in
\mathbb{Z}[X]$, which we will henceforth assume. Now, $P(X) = \sum_{i=0}^{n} a_i X^i$ where $n \ge 1$. If
$a_0 = 0$, then clearly $P(p) \equiv 0 \bmod p$ for any prime $p$. If $a_0 \neq 0$, let us consider for
any integer $t$ the polynomial

\[ Q(X) = \frac{P(a_0 t X)}{a_0} = 1 + \sum_{i=1}^{n} a_i a_0^{i-1} t^i X^i. \]

There exists some prime number $p$ such that $Q(m) \equiv 0 \bmod p$ for some $m$,
because $Q$ can take the values 0, 1, $-1$ only at finitely many points. Since
$Q(m) \equiv 1 \bmod t$, we have $(p, t) = 1$. Then $P(a_0 t m) \equiv 0 \bmod p$. Since $t$ was arbitrary,
the set of $p$ arising in this manner is infinite.
Remark 2.

(a) Note that it may be possible to construct infinitely many terms of the sequence
$\{P(m)\}_{m \in \mathbb{Z}}$ using only a finite number of primes. For example, take $(a, d)
= 1$, $a > d > 1$. Since, by Euler's theorem, $a^{\varphi(d)} \equiv 1 \bmod d$, the numbers
$\frac{a(a^{\varphi(d)n} - 1)}{d} \in \mathbb{Z}$ for all $n$. For the polynomial $P(X) = dX + a$, the infinitely many
values $P\left(\frac{a(a^{\varphi(d)n} - 1)}{d}\right) = a^{\varphi(d)n + 1}$ have only prime factors coming from
primes dividing $a$.

(b) In order that the values of an integral polynomial $P(X)$ be prime for infinitely
many integers, $P(X)$ must be irreducible over $\mathbb{Z}$ and of content 1. By content,
we mean the greatest common divisor of the coefficients.


Box 1. Eisenstein's Criterion and More

Perhaps the only general criterion known to check whether an integral polynomial
of a special kind is irreducible is due to G Eisenstein, a student of
Gauss and an outstanding mathematician, whom Gauss is said to have rated
above himself. Eisenstein died when he was 29.

Let $f(X) = a_0 + a_1 X + \cdots + a_n X^n$ be an integral polynomial satisfying the
following property with respect to some prime $p$. The prime $p$ divides $a_0, a_1,
\ldots, a_{n-1}$ but does not divide $a_n$. Also, assume that $p^2$ does not divide $a_0$.
Then, $f$ is irreducible.

The proof is indeed very simple high school algebra. Suppose, if possible,
that $f(X) = g(X)h(X) = (b_0 + b_1 X + \cdots + b_r X^r)(c_0 + c_1 X + \cdots + c_s X^s)$
with $r, s \ge 1$. Comparing coefficients, one has

\[ a_0 = b_0 c_0, \quad a_1 = b_0 c_1 + b_1 c_0, \quad \ldots, \quad a_n = b_r c_s, \quad r + s = n. \]

Since $a_0 = b_0 c_0 \equiv 0 \bmod p$, either $b_0 \equiv 0 \bmod p$ or $c_0 \equiv 0 \bmod p$.
To fix notations, we may assume that $b_0 \equiv 0 \bmod p$. Since $a_0 \not\equiv 0 \bmod
p^2$, we must have $c_0 \not\equiv 0 \bmod p$. Now $a_1 = b_0 c_1 + b_1 c_0 \equiv b_1 c_0 \bmod p$;
so $b_1 \equiv 0 \bmod p$. Proceeding inductively in this manner, it is clear that all
the $b_i$'s are multiples of $p$. This is a manifest contradiction of the fact that
$a_n = b_r c_s$ is not a multiple of $p$. This finishes the proof.

It may be noted that one may reverse the roles of $a_0$ and $a_n$ and obtain
another version of the criterion:

Let $f(X) = a_0 + a_1 X + \cdots + a_n X^n$ be an integral polynomial satisfying
the following property with respect to some prime $p$. The prime $p$ divides
$a_1, a_2, \ldots, a_n$ but does not divide $a_0$. Also, assume that $p^2$ does not divide
$a_n$. Then, $f$ is irreducible.

The following generalisation is similar to prove and is left as an exercise.
Let $f(X) = a_0 + a_1 X + \cdots + a_n X^n$ be an integral polynomial satisfying the
following property with respect to some prime $p$. Let $t$ be such that the prime
$p$ divides $a_0, a_1, \ldots, a_{n-t}$ but does not divide $a_n$. Also, assume that $p^2$ does
not divide $a_0$. Then, $f$ is either irreducible or has a nonconstant factor of
degree less than $t$.
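
The hypotheses of the first version of the criterion are easy to test mechanically. A minimal sketch follows, assuming the polynomial is given as a list of integer coefficients with the constant term first; the function names are illustrative only. An empty result means only that the criterion does not apply, not that the polynomial is reducible.

```python
def prime_divisors(m):
    """Distinct prime divisors of a nonzero integer, by trial division."""
    m, p, out = abs(m), 2, []
    while p * p <= m:
        if m % p == 0:
            out.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        out.append(m)
    return out


def eisenstein_primes(coeffs):
    """coeffs = [a_0, ..., a_n]. Return the primes p with p | a_0, ..., a_{n-1},
    p not dividing a_n, and p^2 not dividing a_0."""
    a0, an = coeffs[0], coeffs[-1]
    if a0 == 0:
        return []
    return [p for p in prime_divisors(a0)
            if all(c % p == 0 for c in coeffs[:-1])
            and an % p != 0
            and a0 % (p * p) != 0]
```

For example, X^3 - 2 passes with p = 2, and 1 + X + X^2 + X^3 + X^4 passes with p = 5 after the substitution X -> X + 1, which gives X^4 + 5X^3 + 10X^2 + 10X + 5.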
In general, it is difficult to decide whether a given integral polynomial is irreducible
or not. We note that the irreducibility of $P(X)$ and the condition that it have
content 1 are not sufficient to ensure that $P(X)$ takes infinitely many prime values.
For instance, the polynomial $X^9 + 105X + 12$ is irreducible, by Eisenstein's criterion
(see Box 1). But, it cannot take any prime value because it takes only even
values, and it does not take either of the values $\pm 2$ since both $X^9 + 105X + 10$ and
$X^9 + 105X + 14$ are irreducible, again by Eisenstein's criterion.

Lemma 4. Let $a_1, \ldots, a_n$ be distinct integers. Then $P(X) = (X - a_1) \cdots (X - a_n) - 1$
is irreducible.

Proof. Suppose, if possible, $P(X) = f(X)g(X)$ with $\deg f, \deg g < n$. Evidently,
as $f(a_i)g(a_i) = -1$, $f(a_i) = -g(a_i) = \pm 1$ for all $1 \le i \le n$. Now, $f(X) + g(X)$
being a polynomial of degree $< n$ which vanishes at the $n$ distinct integers $a_1, \ldots, a_n$
must be identically zero. This gives $P(X) = -f(X)^2$, but this is impossible as can
be seen by comparing the coefficients of $X^n$.

Exercise 1. Let $n$ be odd and $a_1, \ldots, a_n$ be distinct integers. Prove that $(X - a_1) \cdots
(X - a_n) + 1$ is irreducible.

Let us consider the following situation. Suppose $p = a_n \ldots a_0$ is a prime number
expressed in the usual decimal system, i.e., $p = a_0 + 10a_1 + 100a_2 + \cdots + 10^n a_n$,
$0 \le a_i \le 9$. Then, is the polynomial $a_0 + a_1 X + \cdots + a_n X^n$ irreducible? This is, in
fact, true and, more generally,

Lemma 5. Let $P(X) \in \mathbb{Z}[X]$ and assume that there exists an integer $n$ such that

(i) the zeros of $P$ lie in the half plane $\mathrm{Re}(z) < n - \frac{1}{2}$,

(ii) $P(n - 1) \neq 0$,

(iii) $P(n)$ is a prime number.

Then $P(X)$ is irreducible.

Proof. Suppose, if possible, $P(X) = f(X)g(X)$ over $\mathbb{Z}$. All the zeros of $f(X)$
also lie in $\mathrm{Re}(z) < n - \frac{1}{2}$. Therefore, $|f(n - \frac{1}{2} - t)| < |f(n - \frac{1}{2} + t)|$ for all $t > 0$. Since
$f(n - 1) \neq 0$ and $f(n - 1)$ is integral, we have $|f(n - 1)| \ge 1$. Thus $|f(n)| >
|f(n - 1)| \ge 1$. A similar thing holding for $g(X)$, we get that $P(n)$ has proper divisors
$f(n)$, $g(n)$ which contradicts our hypothesis.


Irreducibility and Congruence Modulo p 

For an integral polynomial to take the value zero at an integer or even to be reducible,
it is clearly necessary that these properties hold modulo any integer $m$. Conversely,
if an irreducible $P(X)$ has a root modulo every integer, it must itself have a root in $\mathbb{Z}$.
In fact, if an irreducible $P(X) \in \mathbb{Z}[X]$ has a linear factor modulo all but finitely many
prime numbers, then $P(X)$ itself is linear. (Irreducibility is needed here: the polynomial
$(X^2 - 2)(X^2 - 17)(X^2 - 34)$ has a root modulo every integer but none in $\mathbb{Z}$.) This
fact can be proved only by deep methods, viz. using the so-called Chebotarev density
theorem. On the other hand (see Lemma 7),
it was first observed by Hilbert that the reducibility of a polynomial modulo every
integer is not sufficient to guarantee its reducibility over $\mathbb{Z}$. Regarding roots of a
polynomial modulo a prime, there is the following general result due to Lagrange:

Lemma 6. Let $p$ be a prime number and let $P(X) \in \mathbb{Z}[X]$ be of degree $n$. Assume
that not all coefficients of $P$ are multiples of $p$. Then the number of solutions mod $p$
to $P(X) \equiv 0 \bmod p$ is, at the most, $n$.

The proof is obvious using the division algorithm over $\mathbb{Z}/p$. In fact, the general result
of this kind (provable by the division algorithm again) is that a nonzero polynomial
over any field has at the most its degree number of roots.

Remark 3. Since $1, 2, \ldots, p - 1$ are solutions to $X^{p-1} \equiv 1 \bmod p$, we have
$X^{p-1} - 1 \equiv (X - 1)(X - 2) \cdots (X - (p - 1)) \bmod p$. For odd $p$, putting $X = 0$ gives
Wilson's theorem that $(p - 1)! \equiv -1 \bmod p$.
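
The congruence in Remark 3 can be checked directly for small primes by multiplying out the linear factors modulo p. The sketch below does this with plain coefficient lists (constant term first); the function names are illustrative.

```python
def poly_mul_mod(f, g, p):
    """Product of two coefficient lists (lowest degree first), reduced mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def product_of_linear_factors(p):
    """Coefficients of (X - 1)(X - 2)...(X - (p - 1)) reduced mod p."""
    f = [1]
    for k in range(1, p):
        f = poly_mul_mod(f, [(-k) % p, 1], p)
    return f
```

For a prime p the result should be the coefficient list of X^(p-1) - 1 mod p, that is, p - 1 followed by zeros and a final 1. The constant term is (p - 1)! up to the sign (-1)^(p-1), which is where Wilson's theorem comes from.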

Note that we have observed earlier that any non-constant integral polynomial has 
a root modulo infinitely many primes. However, as first observed by Hilbert, the 
reducibility of a polynomial modulo every integer does not imply its reducibility 
over Z. For example, we have the following result: 

Lemma 7. Let $p, q$ be odd prime numbers such that $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right) = 1$ and $p \equiv
1 \bmod 8$. Here $\left(\frac{p}{q}\right)$ denotes the Legendre symbol defined to be 1 or $-1$ according as
$p$ is a square or not modulo $q$. Then, the polynomial $P(X) = (X^2 - p - q)^2 - 4pq$ is
irreducible, whereas it is reducible modulo any integer.

Proof.

\[ P(X) = X^4 - 2(p + q)X^2 + (p - q)^2 \]
\[ = (X - \sqrt{p} - \sqrt{q})(X + \sqrt{p} + \sqrt{q})(X - \sqrt{p} + \sqrt{q})(X + \sqrt{p} - \sqrt{q}). \]

Since $\sqrt{p}$, $\sqrt{q}$, $\sqrt{p} \pm \sqrt{q}$, $\sqrt{pq}$ are all irrational, none of the linear or quadratic factors
of $P(X)$ are in $\mathbb{Z}[X]$, i.e., $P(X)$ is irreducible. Note that it is enough to show that a
factorisation of $P$ exists modulo any prime power as we can use the Chinese remainder
theorem to get a factorisation modulo a general integer.

Now, $P(X)$ can be written in the following ways:

\[ P(X) = X^4 - 2(p + q)X^2 + (p - q)^2 \]
\[ = (X^2 + p - q)^2 - 4pX^2 \]
\[ = (X^2 - p + q)^2 - 4qX^2 \]
\[ = (X^2 - p - q)^2 - 4pq. \]

The second and third equalities above show that $P(X)$ is reducible modulo any $q^n$
and any $p^n$, respectively. Also since $p \equiv 1 \bmod 8$, $p$ is a square modulo any $2^n$ and the second
equality above again shows that $P(X)$ is the difference of two squares modulo $2^n$,
and hence reducible mod $2^n$.

If $\ell$ is a prime $\neq 2, p, q$, let us show now that $P(X)$ is reducible modulo $\ell^n$ for
any $n$.

At least one of $\left(\frac{p}{\ell}\right)$, $\left(\frac{q}{\ell}\right)$ and $\left(\frac{pq}{\ell}\right)$ is 1 because, by the product formula for Legendre
symbols, $\left(\frac{p}{\ell}\right) \cdot \left(\frac{q}{\ell}\right) \cdot \left(\frac{pq}{\ell}\right) = 1$. According as $\left(\frac{p}{\ell}\right)$, $\left(\frac{q}{\ell}\right)$ or $\left(\frac{pq}{\ell}\right) = 1$, the second,
third or fourth equality shows that $P(X)$ is reducible mod $\ell^n$ for any $n$.
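
A small numerical check of Lemma 7 may help. The sketch below takes p = 17 and q = 13 (chosen here as an assumption-free instance of the hypotheses: 17 is 1 mod 8, 17 is 4 mod 13, and 8^2 is 13 mod 17), so that P(X) = X^4 - 60X^2 + 16, and searches by brute force for a factorisation into two monic quadratics modulo a prime. Only primes, not prime powers, are tested.

```python
def quadratic_factorisation_mod(ell, p=17, q=13):
    """Search for (a, b, d) with
    (X^2 + aX + b)(X^2 - aX + d) = X^4 - 2(p+q)X^2 + (p-q)^2  (mod ell).

    The X^3 coefficient vanishes, so the two linear coefficients are a and -a.
    """
    c2 = (-2 * (p + q)) % ell
    c0 = ((p - q) ** 2) % ell
    for a in range(ell):
        for b in range(ell):
            for d in range(ell):
                if ((b + d - a * a) % ell == c2
                        and (a * (d - b)) % ell == 0
                        and (b * d) % ell == c0):
                    return a, b, d
    return None
```

Each of the difference-of-squares identities in the proof produces a factorisation of exactly this shape, so the search should succeed for every prime.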

We end this section with a result of Schur whose proof is surprising and elegant as 
well. This is: 

SCHUR'S THEOREM. For any $n$, the truncated exponential polynomial $E_n(X) =
n!\left(1 + X + \frac{X^2}{2!} + \cdots + \frac{X^n}{n!}\right)$ is irreducible over $\mathbb{Z}$.

Just for this proof, we need some nontrivial number theoretic facts. A reader unfamiliar
with these notions but who is prepared to accept at face value a couple of
results can still appreciate the beauty of Schur's proof. Here is where we have to take
recourse to some very basic facts about prime decomposition in algebraic number
fields. Suppose, if possible, that $E_n(X) = f(X)g(X)$ for some nonconstant, irreducible
integral polynomial $f$. Let us write $f(X) = a_0 + a_1 X + \cdots + X^r$ (evidently,
we may take the top coefficient of $f$ to be 1). Start with any (complex) root $\alpha$ of $f$
and look at the field $K = \mathbb{Q}(\alpha)$ of all those complex numbers which can be written
as polynomials in $\alpha$ with coefficients from $\mathbb{Q}$. The basic fact that we will be using
(without proof) is that any nonzero ideal in 'the ring of integers of $K$' (i.e., the subring
$O_K$ of $K$ made up of those elements which satisfy a monic integral polynomial)
is uniquely a product of nonzero prime ideals and a prime ideal can occur at the most
$\deg f$ times. This is a good replacement for $K$ of the usual unique factorisation of
natural numbers into prime numbers. The proof also uses a fact about prime numbers
which was observed by Sylvester but is not trivial to prove.

SYLVESTER'S THEOREM. If $m \ge r$, then $(m + 1)(m + 2) \cdots (m + r)$ has a prime factor
$p > r$.

The special case $m = r$ is known as Bertrand's postulate.
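
Sylvester's theorem is easy to test numerically for small m and r, although this is of course no substitute for a proof. The following sketch uses trial division; the function names are illustrative.

```python
def largest_prime_factor(m):
    """Largest prime factor of an integer m >= 2, by trial division."""
    largest, p = 1, 2
    while p * p <= m:
        while m % p == 0:
            largest, m = p, m // p
        p += 1
    return max(largest, m) if m > 1 else largest


def sylvester_holds(m, r):
    """Check that (m+1)(m+2)...(m+r) has a prime factor exceeding r.

    The product has such a factor exactly when one of its terms does.
    """
    return any(largest_prime_factor(m + k) > r for k in range(1, r + 1))
```

Note that the hypothesis m >= r matters: for m = 1 and r = 3 the product 2 · 3 · 4 has no prime factor exceeding 3.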

Proof of Schur's theorem. Now, the proof uses the following fact which is interesting
in its own right:

Any prime dividing the constant term $a_0$ of $f$ is less than or equal to the degree $r$ of $f$.

To see this, let $p$ be a prime dividing $a_0$ and note first that $N(\alpha)$, the 'norm of $\alpha$'
(a name for the product of all the roots of the minimal polynomial $f$ of $\alpha$), is $a_0$ up to
sign. So, there is a prime ideal $P$ of $O_K$ such that $(\alpha) = P^k I$, $(p) = P^l J$, where $I$, $J$
are indivisible by $P$ and $k, l \ge 1$. Here, $(\alpha)$ and $(p)$ denote, respectively, the ideal of $O_K$
generated by $\alpha$ and $p$. Since $E_n(\alpha) = 0$, we have

\[ 0 = n! + n!\,\alpha + n!\,\alpha^2/2! + \cdots + \alpha^n. \]

We know that the exact power of $p$ dividing $n!$ is

\[ h_n = [n/p] + [n/p^2] + \cdots. \]

Thus, in $O_K$, the ideal $(n!)$ is divisible by $P^{l h_n}$ and no higher power of $P$. Similarly,
for $1 \le i \le n$, the ideal generated by $n!\,\alpha^i/i!$ is divisible by $P^{l h_n - l h_i + k i}$. Because of
the equality

\[ -n! = n!\,\alpha + n!\,\alpha^2/2! + \cdots + \alpha^n \]

it follows that we cannot have each $l h_n - l h_i + k i$ strictly bigger than $l h_n$, which is
the exact power of $P$ dividing the left-hand side. Therefore, there is some $i$ such that
$-l h_i + k i \le 0$. Thus,

\[ i \le k i \le l h_i = l\left([i/p] + [i/p^2] + \cdots\right) < \frac{l i}{p - 1}. \]

Thus, $p - 1 < l \le r$, i.e., $p \le r$. This confirms the observation.

To continue with the proof, we may clearly assume that the degree $r$ of $f$ is at most
$n/2$. Now, we use Sylvester's theorem to choose a prime $q > r$ dividing the product
$n(n - 1) \cdots (n - r + 1)$. Note that we can use this theorem because the smallest term
$n - r + 1$ of this $r$-fold consecutive product is bigger than $r$ as $r \le n/2$. Note also that
the observation tells us that $q$ cannot divide $a_0$. Now, we shall write $E_n(X)$ modulo
the prime $q$. By choice, $q$ divides the coefficients of $X^i$ for $0 \le i \le n - r$.

So, $f(X)g(X) \equiv X^n + nX^{n-1} + \cdots + \frac{n!}{(n - r + 1)!}X^{n-r+1} \bmod q$.

Write $f(X) = a_0 + a_1 X + \cdots + X^r$ and $g(X) = b_0 + b_1 X + \cdots + X^{n-r}$.

The above congruence gives $a_0 b_0 \equiv 0$, $a_0 b_1 + a_1 b_0 \equiv 0$ etc. mod $q$ until the coefficient
of $X^{n-r}$ of $f(X)g(X)$. As $a_0 \not\equiv 0 \bmod q$, we get recursively (this is just like
the proof of Eisenstein's criterion, see Box 1) that

\[ b_0 \equiv b_1 \equiv \cdots \equiv b_{n-r} \equiv 0 \bmod q. \]

This is impossible as $b_{n-r} = 1$. Thus, Schur's assertion follows.


Polynomials taking Square Values 

If an integral polynomial takes only values which are squares, is it true that the poly¬ 
nomial itself is a square of a polynomial? In this section, we will show that this, and 
more, is indeed true. 

Lemma 8. Let $P(X)$ be a $\mathbb{Z}$-valued polynomial which is irreducible. If $P$ is not
a constant, then there exist arbitrarily large integers $n$ such that $P(n) \equiv 0 \bmod p$ and
$P(n) \not\equiv 0 \bmod p^2$ for some prime $p$.

Proof. First, suppose that $P(X) \in \mathbb{Z}[X]$. Since $P$ is irreducible, $P$ and $P'$ have
no common factors. Write $f(X)P(X) + g(X)P'(X) = c$ for some $f, g \in \mathbb{Z}[X]$ and some
nonzero integer $c$ (clear denominators in the corresponding identity over $\mathbb{Q}$). By
Lemma 3 there is a prime $p$ not dividing $c$ such that $P(n) \equiv 0 \bmod p$, where $n$ can be
as large as we want. So, $P'(n) \not\equiv 0 \bmod p$ as $f(n)P(n) + g(n)P'(n) = c$. Since
$P(n + p) - P(n) \equiv pP'(n) \bmod p^2$, either $P(n + p)$ or $P(n)$ is $\not\equiv 0 \bmod p^2$. To prove
the result for general $P$, one can replace $P$ by $m!P$ where $m = \deg P$.
Lemma 9. Let $P(X)$ be a $\mathbb{Z}$-valued polynomial such that the zeros of smallest
multiplicity have multiplicity $m$. Then, there exist arbitrarily large integers $n$ such
that $P(n) \equiv 0 \bmod p^m$, $P(n) \not\equiv 0 \bmod p^{m+1}$ for some prime $p$.

Proof. Let $P_1(X), \ldots, P_r(X)$ be the distinct irreducible factors of $P(X)$. Write
$P(X) = P_1(X)^{m_1} \cdots P_r(X)^{m_r}$ with $m = m_1 \le \cdots \le m_r$. By the above Lemma, one
can find arbitrarily large $n$ such that for some prime $p$, $P_1(n) \equiv 0 \bmod p$, $P_1(n) \not\equiv 0
\bmod p^2$ and $P_i(n) \not\equiv 0 \bmod p$ for $i > 1$. Then, $P(n) \equiv 0 \bmod p^m$ and $\not\equiv 0 \bmod p^{m+1}$.

Corollary 2. If $P(X)$ takes at every integer a value which is the $k$th power of an
integer, then $P(X)$ itself is the $k$th power of a polynomial.

Proof. If $P(X)$ is not an exact $k$th power, then one can write $P(X) = f(X)^k g(X)$
for polynomials $f, g$ so that $g(X)$ has a zero whose multiplicity is $< k$. Once again,
we can choose $n$ and a prime $p$ such that $g(n) \equiv 0 \bmod p$, $\not\equiv 0 \bmod p^k$. This contradicts
the fact that $P(n)$ is a $k$th power.

[2] is an excellent source of results of this nature. 


Cyclotomic Polynomials 

These were referred to already in an earlier article ([1]). It was also shown there that
one could use these polynomials to prove the existence of infinitely many primes
congruent to 1 modulo $n$ for any $n$. For a natural number $d$, recall that the cyclotomic
polynomial $\Phi_d(X)$ is the irreducible, monic polynomial whose roots are the primitive
$d$th roots of unity, i.e., $\Phi_d(X) = \prod_{a \le d,\ (a,d) = 1} (X - e^{2ia\pi/d})$. Note that $\Phi_1(X) =
X - 1$ and that for a prime $p$, $\Phi_p(X) = X^{p-1} + \cdots + X + 1$. Observe that for any
$n \ge 1$, $X^n - 1 = \prod_{d \mid n} \Phi_d(X)$.

Exercise 2. Prove that for any $d$, $\Phi_d(X)$ has integral coefficients, and is irreducible
over $\mathbb{Z}$.

Factorising an integral polynomial into irreducible factors is far from easy. Even 
if we know the irreducible factors, it might be difficult to decide whether a given 
polynomial divides another given one. 

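Exercise 3.

(a) Given positive integers $a_1 < \cdots < a_n$, consider the polynomials $P(X)
= \prod_{i > j} (X^{a_i - a_j} - 1)$ and $Q(X) = \prod_{i > j} (X^{i - j} - 1)$. By factorising into cyclotomic
polynomials, prove that $Q(X)$ divides $P(X)$. Conclude that $\prod_{i > j} \frac{a_i - a_j}{i - j}$
is always an integer.

(b) Consider the $n \times n$ matrix $A$ whose $(i, j)$th entry is the Gaussian polynomial
$\begin{bmatrix} a_i \\ j - 1 \end{bmatrix}$.
Compute $\det A$ to obtain part (a) again.

Here, for $m \ge r$, the Gaussian polynomial is defined as

\[ \begin{bmatrix} m \\ r \end{bmatrix} = \frac{(X^m - 1)(X^{m-1} - 1) \cdots (X^{m-r+1} - 1)}{(X^r - 1)(X^{r-1} - 1) \cdots (X - 1)}. \]

Note that

\[ \begin{bmatrix} m \\ r \end{bmatrix} = \begin{bmatrix} m - 1 \\ r - 1 \end{bmatrix} + X^r \begin{bmatrix} m - 1 \\ r \end{bmatrix}. \]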
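
The integrality statement in part (a) of the exercise above, namely that the product over i > j of (a_i - a_j)/(i - j) is an integer for positive integers a_1 < ... < a_n, can be checked numerically with exact rational arithmetic. A minimal sketch, with an illustrative function name:

```python
from fractions import Fraction
from itertools import combinations


def difference_ratio(a):
    """Product over all pairs j < i of (a[i] - a[j]) / (i - j),
    for a strictly increasing list of integers a."""
    result = Fraction(1)
    for j, i in combinations(range(len(a)), 2):  # pairs with j < i
        result *= Fraction(a[i] - a[j], i - j)
    return result
```

For example, `difference_ratio([1, 3, 4, 9])` multiplies 2, 3/2, 8/3, 1, 3 and 5, which gives 120.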


It seems from looking at $\Phi_p(X)$ for prime $p$ as though the coefficients of the cyclotomic
polynomials $\Phi_d(X)$ for any $d$ are all 0, 1 or $-1$. However, the following rather
amazing fact was discovered by Schur. His proof uses a consequence of a deep result
about prime numbers known as the prime number theorem. The prime number theorem
tells us that $\pi(x) \sim x/\log(x)$ as $x \to \infty$. Here $\pi(x)$ denotes the number of
primes up to $x$. The reader does not need to be familiar with the prime number theorem
but is urged to take on faith the consequence of it that for any constant $c$, there
is $n$ such that $\pi(2^n) \ge cn$.


Proposition 1. Every integer occurs as a coefficient of some cyclotomic polyno¬ 
mial. 

Proof. First, we claim that for any integer $t > 2$, there are primes $p_1 < p_2 < \cdots < p_t$
such that $p_1 + p_2 > p_t$. Suppose this is not true. Then, for some $t > 2$, every set of $t$
primes $p_1 < \cdots < p_t$ satisfies $p_1 + p_2 \le p_t$. So, $2p_1 < p_t$. Therefore, the number of
primes between $2^k$ and $2^{k+1}$ for any $k$ is less than $t$. So, $\pi(2^k) < kt$. This contradicts
the prime number theorem as noted above. Hence, it is indeed true that for any
integer $t > 2$, there are primes $p_1 < p_2 < \cdots < p_t$ such that $p_1 + p_2 > p_t$.

Now, let us fix any odd $t > 2$. We shall demonstrate that both $-t + 1$ and $-t + 2$
occur as coefficients. This will prove that all negative integers occur as coefficients.
Then, using the fact that for an odd $m > 1$, $\Phi_{2m}(X) = \Phi_m(-X)$, we can conclude
that all integers are coefficients.

Consider now primes $p_1 < p_2 < \cdots < p_t$ such that $p_1 + p_2 > p_t$. Write $p_t = p$
for simplicity. Let $n = p_1 \cdots p_t$ and let us write $\Phi_n(X)$ modulo $X^{p+1}$. Since $X^n - 1
= \prod_{d \mid n} \Phi_d(X)$, and since $p_1 + p_2 > p_t$, we have

\[ \Phi_n(X) \equiv \frac{\prod_{i=1}^{t} (1 - X^{p_i})}{1 - X} \]
\[ \equiv (1 + X + \cdots + X^p)(1 - X^{p_1}) \cdots (1 - X^{p_t}) \]
\[ \equiv (1 + X + \cdots + X^p)(1 - X^{p_1} - \cdots - X^{p_t}) \bmod X^{p+1}. \]

Therefore, the coefficients of $X^p$ and $X^{p-2}$ are $1 - t$ and $2 - t$, respectively. This
completes the proof. Note that in the proof we have used the fact that if $P(X)
= (1 - X^r)Q(X)$ for a polynomial $Q(X)$, then $Q(X) = P(X)(1 + X^r + X^{2r} + \cdots)$
modulo any $X^s$.
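
The computation in this proof can be reproduced for small cases. The sketch below computes the low-order coefficients of the cyclotomic polynomial of a squarefree n from the standard product formula, that is, the product over d | n of (1 - X^d) raised to the Möbius value of n/d, working with power series truncated at a chosen length. The function name `cyclotomic_prefix` and the choice of primes are illustrative assumptions.

```python
from itertools import combinations


def cyclotomic_prefix(primes, length):
    """Coefficients of X^0, ..., X^(length-1) of the cyclotomic polynomial
    Phi_n, where n is the product of the given distinct primes (n > 1)."""
    coeffs = [0] * length
    coeffs[0] = 1
    t = len(primes)
    for size in range(t + 1):
        for subset in combinations(primes, size):
            d = 1
            for q in subset:
                d *= q
            if d >= length:
                continue  # (1 - X^d)^(+-1) is 1 modulo X^length
            mu = (-1) ** (t - size)  # Moebius function of n/d
            if mu == 1:
                # multiply the truncated series by (1 - X^d)
                for k in range(length - 1, d - 1, -1):
                    coeffs[k] -= coeffs[k - d]
            else:
                # divide the truncated series by (1 - X^d)
                for k in range(d, length):
                    coeffs[k] += coeffs[k - d]
    return coeffs
```

With t = 3 and the primes 3, 5, 7 (so n = 105 and 3 + 5 > 7), the coefficient of X^7 is 1 - t = -2 and that of X^5 is 2 - t = -1. With t = 5 and the primes 11, 13, 17, 19, 23, the coefficients of X^23 and X^21 come out as -4 and -3.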


Exercise 4.

(a) Let $A = (a_{ij})$ be a matrix in $GL(n, \mathbb{Z})$, i.e., both $A$ and $A^{-1}$ have integer
entries. Consider the polynomials $P_i(X) = \sum_{j=0}^{n} a_{ij} X^j$ for $0 \le i \le n$.
Prove that any integral polynomial of degree at most $n$ is an integral linear
combination of the $P_i(X)$. In particular, if $a_0, \ldots, a_n \in \mathbb{Q}$ are distinct, show that
any rational polynomial of degree at most $n$ is of the form $\sum_{i=0}^{n} \lambda_i (X + a_i)^n$
for some $\lambda_i \in \mathbb{Q}$.


(b) Prove that I + A'H- ■ ■ • +X'’ = ( ” / ‘ Conclude 

that 'Zjoo ( 




. This is known 


as Binet’s formula. Further, cornpute 2^- 


n - i 


Remark 4. It is easily seen by induction that $\sum_{i \ge 0} \binom{n - i}{i}$ is the
Fibonacci number $F_{n+1}$.
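
The classical identity relating the diagonal sums of binomial coefficients, that is, the sum over i of C(n - i, i), to the Fibonacci number F_{n+1} can be verified for small n. The sketch assumes the indexing F_0 = 0, F_1 = 1; the function names are illustrative.

```python
from math import comb


def diagonal_sum(n):
    """Sum of C(n - i, i) over all i with 0 <= 2i <= n."""
    return sum(comb(n - i, i) for i in range(n // 2 + 1))


def fibonacci(k):
    """F_k with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a
```

For n = 4 the sum is 1 + 3 + 1 = 5, which is F_5.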

As remarked earlier, even for a polynomial of degree 2 (like $X^2 + 1$) it is unknown
whether it takes infinitely many prime values. A general conjecture (Bouniakowsky,
Schinzel and Sierpinski) in this context is:

A nonconstant irreducible integral polynomial whose coefficients have
no nontrivial common factor always takes on a prime value.

We end with an open question which is typical of many number-theoretic questions:
a statement which can be understood by the proverbial layman but an answer
which proves elusive to this day to professional mathematicians. For any irreducible,
monic, integral polynomial $P(X)$, define its Mahler measure to be $M(P)
= \prod_i \mathrm{Max}(|\alpha_i|, 1)$, where the product is over the roots $\alpha_i$ of $P$. The following is an
easy exercise.

Exercise 5. M(P) = 1 if and only if P is cyclotomic. 

D H Lehmer posed the following question:

Does there exist $C > 0$ such that $M(P) > 1 + C$ for all noncyclotomic
(irreducible) polynomials $P$?

This is expected to have an affirmative answer and, indeed, Lehmer's calculations
indicate that the smallest possible value of $M(P) \neq 1$ is 1.176280818..., which
occurs for the polynomial

\[ P(X) = X^{10} + X^9 - X^7 - X^6 - X^5 - X^4 - X^3 + X + 1. \]

Lehmer’s question can be formulated in terms of discrete subgroups of Lie groups. 
One may not be able to predict when it can be answered but it is more or less certain 
that one will need tools involving deep mathematics. 
Prime Representing Quadratics


N V Tejaswi 


This chapter gives a proof of the result contained in the remark by R Tandon in
Chapter 9. In fact, the converse of that statement is also true. This result was proved
by Rabinowitz and Frobenius around 1912. A much simpler proof was given by
Ayoub and Chowla in the Journal of Number Theory 13, 443-445 (1981). We believe
the proof given here is new and is simpler than any of the available ones.¹

We use the notation contained in Chapter 9 with the additional observation that
equivalent forms represent the same set of numbers. Let $p$ be a prime with $p \equiv 3
\pmod 4$ and $n = (p + 1)/4$. We have:

Theorem 1. The class number of forms with discriminant $-p$ is 1, i.e., $h(-p) = 1$,
if and only if for each $x$, $0 \le x < n - 1$, $x^2 + x + n$ is a prime number.


Proof. Suppose there exists an integer $b$, $0 \le b < n - 1 = (p - 3)/4$, such that
$b^2 + b + n$ is not a prime. Then there is a prime $q$ such that

\[ b^2 + b + n = aq, \]

with $q^2 \le b^2 + b + (p + 1)/4$. We have

\[ 4q^2 \le (2b + 1)^2 + p < \left(\frac{2(p - 3)}{4} + 1\right)^2 + p = \left(\frac{p + 1}{2}\right)^2, \]

i.e.,

\[ q < \frac{p + 1}{4}, \]

and

\[ 4aq = (2b + 1)^2 + p. \]


¹ After this article had been submitted to Resonance, I came to know that Frobenius's proof closely
resembles this proof.


Consider the quadratic forms

\[ f(x, y) = x^2 + xy + ny^2 \quad \text{and} \quad g(x, y) = ax^2 + (2b + 1)xy + qy^2. \]

Both have discriminant equal to $-p$. Since the class number is 1, both these forms
should be equivalent, and hence should represent the same set of integers. Clearly, $q$
is representable by $g(x, y)$ (take $x = 0$, $y = 1$). But $q$ is not representable by $f(x, y)$.
This follows from $y \neq 0$, for, if $y = 0$ then $q$ would be a square, and for $y \neq 0$ we have

\[ f(x, y) = \tfrac{1}{4}\left((2x + y)^2 + py^2\right) \ge \frac{p}{4}, \]

so that $f(x, y) \ge (p + 1)/4$ since $f(x, y)$ is an integer, while $q < (p + 1)/4$.

For the converse, suppose that $h(-p) \ge 2$; note that $p > 7$ since $h(-3) =
h(-7) = 1$. Then there exists a reduced form

\[ g(x, y) = ax^2 + bxy + cy^2 \]

with discriminant $-p$ which is not equivalent to the (reduced) form

\[ f(x, y) = x^2 + xy + ny^2. \]

From the definition of reduced quadratic forms we have that $a, c > 1$ and $|b| \le
a \le \sqrt{p/3}$. Further note that $b$ is odd and hence $b = 2b' + 1$ for some integer $b'$.
Clearly, $b' < n - 1$ (as $p > 7$) and we have

\[ b'^2 + b' + n = ac, \]

which shows that for $x = b'$, $x^2 + x + n$ is not a prime number, thereby proving the
converse.

The following problem appeared in the 28th International Mathematical Olympiad
in 1987.

Problem. Let $n$ be a natural number. If $k^2 + k + n$ is a prime number for $0 \le k \le
[\sqrt{n/3}]$, show that $k^2 + k + n$ is a prime for $0 \le k \le n - 2$.

In view of this we can restate the above theorem as

Theorem 1'. The class number of forms with discriminant $-p$ is 1, i.e., $h(-p) = 1$,
if and only if for each $x$, $0 \le x \le [\sqrt{n/3}]$, $x^2 + x + n$ is a prime number.
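
The two primality conditions (for 0 <= x <= n - 2 and for 0 <= x <= [sqrt(n/3)]) can be compared by direct computation. The sketch below does so for small n; the function names are illustrative, and the class number itself is not computed.

```python
from math import isqrt


def is_prime(m):
    """Trial-division primality test, adequate for small inputs."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def prime_up_to_n_minus_2(n):
    """True if x^2 + x + n is prime for every 0 <= x <= n - 2."""
    return all(is_prime(x * x + x + n) for x in range(n - 1))


def prime_up_to_sqrt(n):
    """True if x^2 + x + n is prime for every 0 <= x <= floor(sqrt(n/3))."""
    return all(is_prime(x * x + x + n) for x in range(isqrt(n // 3) + 1))


lucky = [n for n in range(2, 200) if prime_up_to_n_minus_2(n)]
```

In the range tested, `lucky` is 2, 3, 5, 11, 17, 41, corresponding to p = 4n - 1 = 7, 11, 19, 43, 67, 163, and the shorter test agrees with the longer one for every n.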


Acknowledgement: I thank C S Yogananda for suggesting the problem and for 
pointing out some errors in the first draft of this chapter. 
The Congruent Number Problem


V Chandrasekar 

In Mathematics, especially number theory, one often comes across problems
which arise naturally and are easy to pose, but whose solutions require very
sophisticated methods. What is known as 'The Congruent Number Problem' is
one such. Its statement is very simple and the problem dates back to antiquity,
but it was only recently that a breakthrough was made, thanks to current
developments in the Arithmetic of elliptic curves, an area of intense research in
number theory.


Introduction 

A positive integer n is called a congruent number if there exists a right-angled trian¬ 
gle whose sides are rational numbers and whose area is the given number n. 

If we represent the sides of such a triangle by $X$, $Y$, $Z$, with $Z$ as the hypotenuse,
then by our definition, a positive integer $n$ is a congruent number if and only if the
two equations

\[ X^2 + Y^2 = Z^2, \qquad \frac{XY}{2} = n \]

have a solution with $X$, $Y$, $Z$ all rational numbers.

Examples.

1. Consider the right-angled triangle with sides $X = 3$, $Y = 4$ and $Z = 5$. Its
area $n$ is $XY/2 = 6$, so 6 is a congruent number. Here we are lucky to find a
suitable triangle for the number 6 whose sides are actually integers. It will be
seen that this is in general an exceptional circumstance.

2. Consider the triangle with sides 3/2, 20/3 and 41/6. This is a right-angled
triangle (!) and its area is 5. Therefore 5 is a congruent number.

Question. Does there exist a right-angled triangle with integral sides and area 
equal to 5? 
Question. Is 1 a congruent number? (There is a lot of history behind this which
will be narrated below.)

One can generate congruent numbers at will by making use of the identity

\[ (X^2 - Y^2)^2 + (2XY)^2 = (X^2 + Y^2)^2 \]

which corresponds to the right-angled triangle with sides $X^2 - Y^2$, $2XY$ and hypotenuse
$X^2 + Y^2$. We substitute our choice of integer values for $X$ and $Y$ and obtain the
congruent number $n = XY(X^2 - Y^2)$. For example, $X = 3$, $Y = 2$ yields the triangle
with sides 5, 12, 13 and area 30. So 30 is a congruent number. For more examples
refer to Box 14.1.

Now any positive integer $n$ can be written as $n = u^2 v$, where $v$ has no square
factors ($v$ is a 'squarefree integer'). It is clear that $n$ is a congruent number if and
only if $v$ is so; the right-angled triangle for $v$ can be obtained from the corresponding
one for $n$, if it exists, by scaling it down by a factor of $u$. (Remember that we allow the
side lengths to take fractional values!) So when deciding whether $n$ is congruent or
not, we may assume that $n$ is a squarefree integer. This will be done in what follows.


Box 14.1 Generating Congruent Numbers

Here p, q are arbitrary positive integers of opposite parity (that is, p + q is
odd), the congruent number n is the squarefree part of pq(p^2 - q^2), and the
sides of the triangle are proportional to p^2 - q^2, 2pq, p^2 + q^2.

Serial number    p     q     n     Sides of the triangle
1                3     2     30    5, 12, 13
2                4     3     21    7/2, 12, 25/2
3                5     4     5     3/2, 20/3, 41/6
4                9     4     65    65/6, 12, 97/6
5                25    16    41    40/3, 123/20, 881/60
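The recipe of Box 14.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `squarefree_part` and `triangle_from` are invented here, and the squarefree part is found by plain trial division, which is adequate only for small p and q.

```python
from fractions import Fraction

def squarefree_part(m):
    """Return (v, u) with m == u*u*v and v squarefree (trial division)."""
    v, u, d = 1, 1, 2
    while d * d <= m:
        e = 0
        while m % d == 0:
            m //= d
            e += 1
        u *= d ** (e // 2)      # pairs of equal prime factors go into u
        if e % 2:
            v *= d              # an unpaired prime factor stays in v
        d += 1
    return v * m, u             # what is left of m is 1 or a single prime

def triangle_from(p, q):
    """Congruent number n and rational triangle generated by p > q > 0."""
    n, u = squarefree_part(p * q * (p * p - q * q))
    sides = (Fraction(p * p - q * q, u),
             Fraction(2 * p * q, u),
             Fraction(p * p + q * q, u))
    return n, sides

for p, q in [(3, 2), (4, 3), (5, 4), (9, 4), (25, 16)]:
    n, (a, b, c) = triangle_from(p, q)
    # Check the two defining equations exactly, in rational arithmetic.
    assert a * a + b * b == c * c and a * b / 2 == n
    print(p, q, n, a, b, c)
```

Scaling the integer triangle down by u is exactly the step that turns pq(p^2 - q^2) = u^2 n into the squarefree area n.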


Now we are ready to formulate: 

The Congruent Number Problem. Given a positive integer n, is there a simple 
criterion which enables us to decide whether or not n is congruent? 


A few remarks are in order. To start with, if we restrict the sides of the triangle to 
integer values only, the question can be settled, at least in theory, in a finite number 
of steps. To see why, recall the equations 

    X^2 + Y^2 = Z^2,    XY/2 = n.

Since X and Y are now integers and XY = 2n, both X and Y divide 2n. So to see if a
solution exists, we let X run through the set of divisors of 2n, let Y = 2n/X and check
whether X^2 + Y^2 is a square integer. Thus the problem can be settled in a routine
manner. For example, we can easily verify that there is no integral solution for the
case n = 5. (Note, however, that we do know that 5 is a congruent number.)
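The finite search just described is easy to write down. The following is a minimal sketch; the function name `integer_triangle_with_area` is an assumption made for this illustration.

```python
from math import isqrt

def integer_triangle_with_area(n):
    """Return an integer-sided right triangle (X, Y, Z) of area n, or None.

    X runs through the divisors of 2n, Y = 2n/X, and we test whether
    X^2 + Y^2 is a perfect square.
    """
    for x in range(1, 2 * n + 1):
        if (2 * n) % x:
            continue                  # x is not a divisor of 2n
        y = 2 * n // x
        if x > y:
            break                     # remaining pairs repeat earlier ones
        z = isqrt(x * x + y * y)
        if z * z == x * x + y * y:
            return (x, y, z)
    return None

print(integer_triangle_with_area(6))   # the 3, 4, 5 triangle
print(integer_triangle_with_area(5))   # no integer-sided triangle exists
```

The loop always terminates after at most 2n steps, which is the sense in which the integer version of the problem is routine.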
But once we allow the sides to have rational values, the problem acquires an
entirely different status. There is no obvious starting point, unlike the case of integer
solutions discussed above. One could endlessly churn out congruent numbers following
the method in Box 14.1 without being certain when a given number n (or n × m^2,
for some integer m) will appear on the list. Continuing in this way would exhaust
one's computing resources, not to mention one's patience! Also, this procedure is of
no avail if n is not a congruent number.

To appreciate this better, consider the following right-angled triangle with area 
101 which was found by Bastien in 1914. This triangle has sides 

    X = 711024064578955010000 / 118171431852779451900,

    Y = 3967272806033495003922 / 118171431852779451900,

and hypotenuse

    Z = 2 × 2015242462949760001961 / 118171431852779451900.

This is known to be the smallest solution (in terms of the sizes of the numerator and 
denominator) corresponding to the congruent number 101! The serial number of this 
triangle in the list in Box 14.1 would exceed 10^^! 

The above considerations force us to look for a more indirect approach in our 
search for a criterion for characterizing congruent numbers. 

Here we have yet another instance of a problem in number theory which is simple 
to state, yet has hidden depths. There have been instances when the solutions of such 
problems have emerged only centuries after being posed. In such instances, a lot of 
deep and beautiful mathematics gets generated as a result. A striking example from 
recent times is the proof of Fermat's last theorem by Andrew Wiles in 1995, which
uses a mind-boggling variety of techniques from several fields in current mathematical
research.

We shall see how the congruent number problem falls into this category by giving
a brief account of its history and the concepts and techniques that were used in the
solution of this problem, which is so deceptively simple to state.


Brief History 

The congruent number problem makes its earliest appearance in an Arab manuscript
traced to the tenth century (c 972 AD). In his classic History of the Theory of Numbers,
Vol 2 (Diophantine Analysis), Dickson quotes Woepcke's view that there is no
indication that the Arabs knew Diophantus prior to the translation by Aboul Wafa
(998 AD), but they may well have come across the problem from the Hindus who
were already acquainted with his work. The Arabs figured out that the following
numbers are congruent: 5, 6, 14, 15, 21, 30, 34, 65, 70, 110, 154, 190 and so on. In
fact, their list contains ten congruent numbers greater than 100, for example, 10374.

The scene later shifts to Pisa, where Leonardo Pisano (better known as Fibonacci),
by virtue of his position as a mathematical expert in his native city, is presented to
the Emperor Frederick II. The king's scholars challenge him to find three rational
numbers whose squares form an arithmetic progression with common difference 5.
This is equivalent to finding integers X, Y, Z, T, with T ≠ 0, such that
Y^2 - X^2 = Z^2 - Y^2 = 5T^2, and this in turn reduces to finding a right triangle
with rational sides

    (Z + X)/T,    (Z - X)/T,    2Y/T

and area 5; in other words, to the question of whether 5 is a congruent number or not.
Leonardo addressed the general problem in his memoir Liber Quadratorum (1225),
which was lost to the world till it was found and published by Prince Boncompagni
in the year 1856. In addition to showing that 5 and 7 are congruent numbers
(the triangles have sides 3/2, 20/3, 41/6 and 35/12, 24/5, 337/60 respectively), he
also states without proof that no congruent number can be a square, or equivalently
that 1 is not a congruent number.

The proof of this statement had to wait for four centuries. Eventually it led to 
Fermat’s discovery of his method of infinite descent, which was to have a profound 
effect on subsequent developments in arithmetic, or number theory as we now call it. 

Fermat had been in correspondence with many of his contemporaries regarding the
existence of a right-angled triangle with rational sides and a square area. An explicit
reference to the application of his technique to prove that this is impossible appears
in his letter to Huygens in 1659, where he states: "As ordinary methods, such as are
found in the books, are inadequate to proving such difficult propositions, I discovered
at last a most singular method ... which I call the infinite descent. At first I used
it only to prove negative assertions such as ... 'there is no right angled triangle
in numbers whose area is a square'. To apply it to affirmative questions is much
harder, so that, when I had to prove that 'Every prime of the form 4n + 1 is a sum
of two squares', I found myself in a sorry plight. But at last such questions proved
amenable to my method." (We infer that the technique of infinite descent had its first
application in number theory to the problem of congruent numbers.) Continuing,
Fermat gives a cryptic description of his method: "If the area of such a triangle were
a square, then there would also be a smaller one with the same property, and so
on, which is impossible, ...". He adds that to explain how his method works would
make his discourse too long, as the whole mystery of his method lay there. To quote
Weil, "Fortunately, just for once, he (Fermat) had found room for this mystery in the
margin of the very last proposition of Diophantus".

Before reproducing Fermat’s proof we prove the following: 

Proposition 1. Let X, Y, Z be the sides of an integer-sided right-angled triangle,
with Z the hypotenuse, such that X, Y, Z have no common factors. Then there exist
relatively prime integers p, q such that p + q is odd, {X, Y} = {p^2 - q^2, 2pq} and
Z = p^2 + q^2.

Proof. Clearly X and Y cannot both be even, as they have no common factors.
Both cannot be odd, for in this case both X^2 and Y^2 would be 1 modulo 4, implying
that Z^2 ≡ 2 (mod 4); but this is absurd as no square is of the form 2 (mod 4). Thus
one of them, say X, is odd and the other, Y, is even. It follows that Z is odd and that
Z + X, Z - X are both even. Therefore (Z + X)/2 and (Z - X)/2 are integers;
indeed they are coprime, because X and Z are themselves coprime.

Since Y^2 = Z^2 - X^2, we obtain:

    (Y/2)^2 = ((Z + X)/2) ((Z - X)/2).

By the unique factorization property of the integers, each factor on the right side
must be a square. Thus (Z + X)/2 = p^2, (Z - X)/2 = q^2, with p and q coprime.
Solving, we obtain

    X = p^2 - q^2,    Y = 2pq,    Z = p^2 + q^2.

Since X is odd, p + q is odd.


Fermat’s Legacy 

We now reproduce Fermat’s proof by the method of descent in the following: 
Theorem. 1 is not a congruent number. 


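Proof. Suppose, on the contrary, that 1 is a congruent number; i.e., there exists a
right-angled triangle with integral sides whose area is a square integer. (Clearing
denominators in a rational triangle of area 1 gives such a triangle, and dividing out
any common factor of the sides keeps the area a square, so we may take the sides to
have no common factor.) In view of Proposition 1, its sides must be of the form
2pq, p^2 - q^2, p^2 + q^2 with p > q > 0, p + q odd and (p, q) = 1.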

Since the area (= pq(p - q)(p + q)) is a square integer and the numbers p,
q, p - q, p + q are mutually coprime, it follows that each of these numbers is a
square integer. We write

    p = x^2,    q = y^2,    p + q = u^2,    p - q = v^2.


Since u and v are odd and coprime, it follows that the gcd of u + v and u - v is 2. But
now we have

    2y^2 = 2q = u^2 - v^2 = (u + v)(u - v).

Arguing as in Proposition 1, we see that there exist integers r, s such that
(u + v, u - v) = (2r^2, 4s^2) or (u + v, u - v) = (4r^2, 2s^2). The former case leads
to u = r^2 + 2s^2, v = r^2 - 2s^2 and therefore to

    x^2 = (u^2 + v^2)/2 = r^4 + 4s^4.

Hence r^2, 2s^2, x are the sides of a right-angled triangle with area (rs)^2 and hypotenuse
x = √p < p^2 + q^2 (the hypotenuse of the triangle with which we started). The case
u + v = 4r^2, u - v = 2s^2 is dealt with in a similar fashion.

So, starting from a right-angled triangle with integral sides whose area is a square 
integer, we have produced another triangle of the same type with a smaller hypotenuse 
than the original triangle. Clearly this process can be repeated. But this gives rise to 
an infinite decreasing sequence of positive integers, a clear absurdity. (This is the
central principle behind infinite descent.) We are thus led to a contradiction and we
conclude that 1 is not a congruent number.

The non-congruent nature of the number 1 is of special interest because it shows
that there is no non-trivial solution to the equation X^4 - Y^4 = Z^2, which in turn
implies Fermat's last theorem ('The equation X^n + Y^n = Z^n has no non-trivial
solutions in integers for n > 2') for the case n = 4!

In the following two propositions we prove the claims made above. 

Proposition 2. A number n is congruent if and only if there exists a rational number
a such that a^2 + n and a^2 - n are both squares of rational numbers.

Proof. Let n be a congruent number and let X, Y, Z be rational numbers satisfying

    X^2 + Y^2 = Z^2,    XY/2 = n.

Then X^2 + Y^2 ± 2XY = Z^2 ± 4n, so

    ((X ± Y)/2)^2 = (Z/2)^2 ± n.

So if we take a = Z/2, then a is rational and a^2 + n and a^2 - n are both squares of
rational numbers.

For the converse, let a be a rational number such that a^2 + n and a^2 - n are squares
of rational numbers. Let

    X = √(a^2 + n) - √(a^2 - n),    Y = √(a^2 + n) + √(a^2 - n),

and

    Z = √(X^2 + Y^2) = √(4a^2) = 2a.

Then X, Y, Z are the sides of a right-angled triangle with rational sides and area
XY/2 = ((a^2 + n) - (a^2 - n))/2 = n.
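Both directions of Proposition 2 can be checked on the triangle 3/2, 20/3, 41/6 of area 5. The sketch below uses exact rational arithmetic; the helper names `rational_sqrt`, `triangle_to_a` and `a_to_triangle` are assumptions introduced only for this illustration.

```python
from fractions import Fraction
from math import isqrt

def rational_sqrt(r):
    """Exact square root of a Fraction, or None if r is not a rational square."""
    if r < 0:
        return None
    num, den = r.numerator, r.denominator   # Fraction keeps these in lowest terms
    a, b = isqrt(num), isqrt(den)
    return Fraction(a, b) if a * a == num and b * b == den else None

def triangle_to_a(X, Y, Z):
    """Forward direction: a = Z/2 makes a^2 + n and a^2 - n rational squares."""
    return Fraction(Z) / 2

def a_to_triangle(a, n):
    """Converse direction: rebuild (X, Y, Z) from a, or None if a does not qualify."""
    s, t = rational_sqrt(a * a + n), rational_sqrt(a * a - n)
    if s is None or t is None:
        return None
    return (s - t, s + t, 2 * a)

a = triangle_to_a(Fraction(3, 2), Fraction(20, 3), Fraction(41, 6))
print(a, rational_sqrt(a * a + 5), rational_sqrt(a * a - 5))
print(a_to_triangle(a, 5))
```

Here a = 41/12, and a^2 + 5 = (49/12)^2, a^2 - 5 = (31/12)^2, so the three squares (31/12)^2, (41/12)^2, (49/12)^2 are in arithmetic progression with common difference 5.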

Proposition 3. If there are non-zero integers X, Y, Z such that X^4 - Y^4 = Z^2,
then 1 is a congruent number.

Proof. Write the equation in the form

    X^4 = Y^4 + Z^2.

Using Proposition 1, we deduce that there exist integers p, q such that
X^2 = p^2 + q^2 and Y^2 = p^2 - q^2. But this leads to

    (X/q)^2 = p^2/q^2 + 1,    (Y/q)^2 = p^2/q^2 - 1.

So p^2/q^2 is a rational square such that p^2/q^2 + 1 and p^2/q^2 - 1 are squares of
rational numbers. In other words 1 is a congruent number.

Combining Propositions 2 and 3 with the fact that 1 is a non-congruent number,
we deduce Fermat's last theorem for n = 4.











Before closing this section, it is fitting to quote Weil's lavish praise of Fermat and
his justly-famous method: "The true breakthrough came in 1922 with Mordell's
celebrated paper; here, if Fermat's name does not occur, the use of the words 'infinite
descent' shows that Mordell was well aware of his indebtedness to his remote
predecessor. Since then the theory of elliptic curves, and its generalizations to curves
of higher genus and to abelian varieties, has been one of the main topics of modern
number theory. Fermat's name, and his method of infinite descent, are indissolubly
bound with it; they promise to remain so in the future".


Congruent Numbers and Elliptic Curves 

Congruent numbers continued to excite the curiosity of number theorists over the
years. Their congruence properties have been investigated and tables of such numbers
constructed. Some classes of numbers have also been identified as congruent
numbers. To cite an example, a result due to Heegner and Birch shows that if n is a
prime number of the form 5 (mod 8) or of the form 7 (mod 8) then n is a congruent
number. (See Box 14.2.)

But what is ultimately sought is a simple and complete characterization of all
congruent numbers; in other words, an algorithm which will quickly determine whether
a given natural number n is congruent or not.


Box 14.2 Some Classes of Congruent Numbers

This box displays some results given in the paper by K Feng [5]. It characterises
some classes of congruent and non-congruent numbers in terms of
their divisibility properties.

To illustrate, Gross's result states that if an integer n is squarefree, has at
most two prime factors and is of the form 5, 6 or 7 (mod 8), then n is a congruent
number.

If p and q are odd primes, then the Legendre symbol (p/q) is 1 if p is a
quadratic residue modulo q (that is, if the equation x^2 ≡ p (mod q) has a
solution), else -1.

In the following account, n is taken to be a squarefree integer. The symbol
'CN' means 'congruent number', while 'Non-CN' means 'non-congruent
number'. p, q, r denote distinct primes and p_i refers to an arbitrary prime
congruent to i mod 8.

For CN

• n = 2p_3 (Heegner 1952, Birch 1968).

• n = p_5, p_7 (Stevens 1975).

• n = p^u q^v ≡ 5, 6, 7 (mod 8), 0 ≤ u, v ≤ 1 (B Gross 1985).

• n = 2p_3 p_5, 2p_5 p_7.

• n = 2p_1 p_7, with (p_1/p_7) = -1 (Monsky 1990).

• n = 2p_1 p_3, with (p_1/p_3) = -1.

For Non-CN

• n = p_3, p_3 q_3, 2p_5, 2p_5 q_5 (Genocchi 1855).

• n = 2p, with p ≡ 9 (mod 16) (Bastien 1913).

• n = p_1 p_3, with (p_1/p_3) = -1 (Lagrange 1974).

• n = 2p_1 p_5, with (p_1/p_5) = -1.

• n = p_1 p_3 q_1, with (p_1/p_3) = (p_3/q_1) = -1.

As it happened, the search for such an algorithm was made possible by relating 
the congruent number problem to the arithmetic of elliptic curves. 

This connection is established as follows. From Proposition 2 we know that a
number n is congruent means there exists a rational square, say u^2, such that u^2 + n
and u^2 - n are both rational squares. This implies that u^4 - n^2 is a rational square,
say v^2; or equivalently that u^6 - n^2 u^2 = u^2 v^2. Setting x = u^2 and y = uv we arrive
at the equation y^2 = x^3 - n^2 x. Thus if n is a congruent number, we obtain a rational
point (x, y) on the curve represented by the equation y^2 = x^3 - n^2 x.

Now the curves corresponding to the equation y^2 = x^3 - n^2 x are examples of
what are known as elliptic curves. The arithmetic of these curves has been a central
topic of research in Number Theory over the years. In view of the above connection,
it was natural to expect that the results relating to elliptic curves would be able to
settle the congruent number problem. This expectation was realized when J Tunnell
succeeded in finding a simple algorithm for the problem. (See Box 14.3 for a brief
outline of the logical steps involved in Tunnell's method.)

Let the reader be reassured that to apply the algorithm one does not need to know 
anything about elliptic curves, modular forms, liftings or L-functions which are (to 
name a few) some of the concepts and techniques which lie at the basis of Tunnell’s 
work! 

In what follows, #S denotes the number of elements of a set S. 

TUNNELL'S Theorem (1983). Let n be a squarefree congruent number (that is, n
is the area of a right-angled triangle with rational sides). Define A_n, B_n, C_n, D_n as
follows:

    A_n = #{(x, y, z) ∈ Z^3 | n = 2x^2 + y^2 + 32z^2},

    B_n = #{(x, y, z) ∈ Z^3 | n = 2x^2 + y^2 + 8z^2},

    C_n = #{(x, y, z) ∈ Z^3 | n = 8x^2 + 2y^2 + 64z^2},

    D_n = #{(x, y, z) ∈ Z^3 | n = 8x^2 + 2y^2 + 16z^2}.

Then:

(A) A_n = B_n/2 if n is odd; and

(B) C_n = D_n/2 if n is even.

If the Birch-Swinnerton-Dyer conjecture is true, then, conversely, these equalities
imply that n is a congruent number.
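The counts in Tunnell's theorem are finite and can be computed by brute force. The sketch below is an illustration only, with the function names `count_solutions` and `tunnell_equality_holds` assumed for the purpose; it expects n to be a squarefree positive integer. A result of False proves that n is not congruent, while a result of True shows congruence only if the Birch-Swinnerton-Dyer conjecture is assumed.

```python
from math import isqrt

def count_solutions(n, a, b, c):
    """Number of integer triples (x, y, z) with a*x^2 + b*y^2 + c*z^2 == n."""
    count = 0
    bound_x = isqrt(n // a)
    for x in range(-bound_x, bound_x + 1):
        rest_x = n - a * x * x
        bound_y = isqrt(rest_x // b)
        for y in range(-bound_y, bound_y + 1):
            rest = rest_x - b * y * y
            if rest % c:
                continue
            z2 = rest // c
            z = isqrt(z2)
            if z * z == z2:
                count += 1 if z == 0 else 2   # z and -z are distinct unless z == 0
    return count

def tunnell_equality_holds(n):
    """Test equality (A) for odd n, or (B) for even n; n must be squarefree."""
    if n % 2:
        return 2 * count_solutions(n, 2, 1, 32) == count_solutions(n, 2, 1, 8)
    return 2 * count_solutions(n, 8, 2, 64) == count_solutions(n, 8, 2, 16)

for n in [1, 2, 3, 5, 6, 7, 10]:
    print(n, tunnell_equality_holds(n))
```

For n = 1 both counts equal 2 (the triples (0, ±1, 0)), so the equality fails and 1 is not congruent; for n = 5, 6, 7 all counts are 0 and the equality holds.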


Box 14.3 Elliptic Curves and the Congruent Number Problem

For each natural number n, let E_n denote the elliptic curve represented by
the equation y^2 = x^3 - n^2 x. Then we have the following correspondence
between the set of right-angled triangles with rational sides and area n and
the set of rational points on E_n. Let the sides be A, B, C where A, B, C are
rational and A < B < C, and let (x, y) be a rational point on E_n such that:
(a) x is the square of a rational number, (b) the denominator of x is even,
(c) the numerator of x has no common factor with n. The correspondence
is given as follows:

    (x, ±y) → (√(x + n) - √(x - n), √(x + n) + √(x - n), 2√x),

    (A, B, C) → (C^2/4, ±(B^2 - A^2)C/8).

It can be shown by means of the above bijection that a number n is congruent
if and only if there exist infinitely many rational solutions (x, y) on the
elliptic curve E_n.

To each elliptic curve E_n, there is associated an important number L(E_n),
which we shall not attempt to define. It is known (this is the Coates-Wiles
Theorem) that if E_n has infinitely many rational solutions, then L(E_n) = 0.
Combining this with the remark in the previous paragraph we deduce the
following: If L(E_n) is not zero, then n is a non-congruent number.

The converse statement, namely that L(E_n) = 0 implies the existence of
infinitely many rational points on E_n (in other words, that L(E_n) = 0
implies that n is congruent) would follow from a famous conjecture due to
Birch and Swinnerton-Dyer. (This conjecture has been made for all elliptic
curves, not just for the E_n defined above.)

Now Tunnell's work can be summarized in one line: he has found an expression
for L(E_n) which is of the form

    L(E_n) = C × (A_n - B_n/2), if n is odd,
    L(E_n) = C × (C_n - D_n/2), if n is even.

Here C is a non-zero number, and A_n, B_n, C_n, D_n are the quantities defined
in the statement of Tunnell's theorem.

The justification of Tunnell's algorithm follows from the above mentioned
facts.
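The map from triangles to points in Box 14.3 can be tried out on small examples. This is a sketch with an assumed helper name, `triangle_to_point`; it takes the triangle (A, B, C) to the point (C^2/4, (B^2 - A^2)C/8) and checks exactly that the result lies on y^2 = x^3 - n^2 x.

```python
from fractions import Fraction

def triangle_to_point(A, B, C, n):
    """Map a rational right triangle of area n to a rational point on E_n."""
    A, B, C = Fraction(A), Fraction(B), Fraction(C)
    assert A * A + B * B == C * C and A * B / 2 == n, "not a right triangle of area n"
    x = (C / 2) ** 2
    y = (B * B - A * A) * C / 8
    assert y * y == x ** 3 - n * n * x      # the point lies on E_n
    return x, y

print(triangle_to_point(3, 4, 5, 6))
print(triangle_to_point(Fraction(3, 2), Fraction(20, 3), Fraction(41, 6), 5))
```

For the 3, 4, 5 triangle this gives x = 25/4 = (5/2)^2, a rational square with even denominator whose numerator is coprime to 6, in line with conditions (a), (b), (c) of the box.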


Observe that Tunnell's theorem gives an unconditional way to establish that a given
number n is non-congruent: it suffices that the relevant equality fails.

Examples.

1. Let n = 1; then A_1 = B_1 = 2, so equation (A) is not valid. We conclude that 1
is not a congruent number.

2. We show similarly that 2 and 3 are not congruent numbers.

3. Let n be squarefree, odd and congruent to 5 or 7 modulo 8. Since 2x^2 + y^2 can
never be congruent to 5 or 7 modulo 8, both cardinalities in (A) are 0 and hence
the condition is satisfied. If the Birch-Swinnerton-Dyer conjecture were true,
we would be able to conclude that any such n is a congruent number. (There
is supportive argument for this statement from the tables and the vanishing of
the so called L-value of the corresponding elliptic curve.)

In particular, 157 would be a congruent number. This is in fact true. A proof of
this fact is furnished by the right-angled triangle whose sides X, Y, Z, displayed below,
were computed by Don Zagier. Again, this is the smallest solution for the area 157!
The sides are X, Y where

    X = 6803298487826435051217540 / 411340519227716149383203,

    Y = 411340519227716149383203 / 21666555693714761309610,

and the hypotenuse is Z where

    Z = 224403517704336969924557513090674863160948472041
        / 8912332268928859588025535178967163570016480830.

A natural question on the part of the reader would concern the appropriateness of 
the word ‘congruent’ in the definition of congruent number. As to that, one cannot 
do better than to quote Richard Guy: "Congruent Numbers are perhaps confusingly
named".

But, after all, what’s there in a name? 

Suggested Reading 

[1] R K Guy. Unsolved Problems in Number Theory. Springer-Verlag, 1981. 

[2] N Koblitz. Introduction to Elliptic Curves and Modular Forms. Springer-Verlag, 
1984. 

[3] J Tunnell. A classical Diophantine problem and modular forms of weight 3/2.
Inventiones Math. 72, 323-334, 1983.

[4] A Weil. Number Theory: An Approach Through History. Birkhauser, 1984. 

[5] K Feng. Non-congruent numbers, odd graphs and the B-S-D conjecture. Acta
Arithmetica, LXXV.1, 1996.


Fermat's Last Theorem

A Theorem at Last!

C S Yogananda

After more than three centuries of effort by some of the best mathematicians,
Gerhard Frey, J-P Serre, Ken Ribet and Andrew Wiles have finally succeeded
in proving Fermat's assertion that the equation X^n + Y^n = Z^n has no solutions
in non-zero integers if n ≥ 3. Each of the four mathematicians made a decisive
contribution, with Wiles delivering the coup de grace. The proof, as it finally
came to be, is in some sense a triumph for Fermat.

When Pierre de Fermat died in 1665, he had not published a single mathematical
work (except for an anonymous appendix to a book written by a colleague). His
mathematical discoveries were contained in his correspondence with other
mathematicians of his time, notably, Pascal, Frenicle de Bessy and Father Mersenne. He
also left behind a few unpublished manuscripts and marginal notes in the books he
studied. We have to be grateful to his son Samuel for whatever we know of Fermat's
work. Samuel de Fermat went through his father's papers and books in addition to
soliciting letters written by his father from his correspondents in order to publish
them. Among Fermat's possessions was a copy of the Latin translation, by Bachet,
of Diophantus' Arithmetic in which Fermat had made a number of marginal notes.

The first work Samuel chose to publish, in 1670, was a new edition of Bachet's
Diophantus with an appendix containing forty eight marginal notes made by Fermat.
The second of these notes appears alongside problem 8 in Book II of Arithmetic:
"... given a number which is square, write it as a sum of two other squares". Fermat's
note states, in Latin, that "on the other hand, it is impossible for a cube to be written
as a sum of two cubes or a fourth power to be written as sum of two fourth powers or,
in general, for any number which is a power greater than the second to be written as
a sum of two like powers. I have a truly marvellous demonstration of this proposition
which this margin is too narrow to contain". Thus, it was in 1670 that the world
learnt of what has come to be termed as Fermat's Last Theorem (FLT): The equation

    X^n + Y^n = Z^n

has no solutions in non-zero integers if n ≥ 3. Fermat himself had given a proof
of this assertion for n = 4 using infinite descent, a method he invented, and Euler
proved the case n = 3. Thus, to prove FLT we need to show that X^p + Y^p = Z^p has
no solutions in non-zero integers whenever p is a prime greater than 3 (do you see
why?).

After more than three centuries of effort by some of the best mathematicians,
Gerhard Frey, J-P Serre, Ken Ribet and Andrew Wiles have finally succeeded in proving
Fermat's assertion, each of them making a decisive contribution, with Wiles delivering
the coup de grace. The proof, as it finally came to be, is in some sense a triumph
for Fermat. Elliptic curves and infinite descent play significant roles; it was Fermat
who pioneered the use of elliptic curves in solving diophantine equations, and it is to
him that we owe the method of infinite descent.

Diophantine Equations 

The chief work of Diophantus of Alexandria (c. 250 A.D.) known to us is the
Arithmetic, a treatise in thirteen books, or Elements, of which only the first six have
survived. This work consists of about 150 problems, each of which asks for the solution
of a given set of algebraic equations in positive rational numbers, and so equations
for which we seek integer (or rational) solutions are referred to as diophantine
equations. The most familiar example we know is X^2 + Y^2 = Z^2, whose solutions are
Pythagorean triples; (3, 4, 5), (5, 12, 13) are examples of such triples. If, instead,
we ask for solutions in integers of X^2 + Y^2 = 3Z^2, we get an example of a diophantine
equation for which there are no solutions in non-zero integers. (To see this, first
observe that we may assume X, Y, Z to be pairwise relatively prime, by cancelling
common factors, if any; and that any square when divided by 3 leaves remainder 0
or 1.) In fact, it is an interesting exercise to characterize the set of natural numbers m
for which the equation X^2 + Y^2 = mZ^2 has no solutions in non-zero integers.

To understand the role of geometry in solving diophantine equations, let us
consider the equation X^2 + Y^2 = Z^2. How do we characterize all solutions (in integers)
of this equation? We could assume again that X, Y, Z is a primitive solution, i.e.,
X, Y, Z are pairwise relatively prime. Dividing by Z^2 and putting X/Z = x and
Y/Z = y we get x^2 + y^2 = 1, that is to say, we get a rational point (a point both
of whose coordinates are rational numbers), (x, y), on the unit circle centered at the
origin. Conversely, a rational point on the circle x^2 + y^2 = 1 will give us a (primitive)
Pythagorean triple. So, our problem reduces to finding all rational points on the unit
circle. We do this by drawing a line with rational slope passing through the point
(-1, 0). This line will meet the circle at one more point and we claim that this point
is also rational. I shall leave it to you to figure out why it is so. (You need to use the
fact that if one root of a quadratic equation with rational coefficients is rational then
the other root is also rational.) This way we obtain all rational points on the circle.
Put t = tan(θ/2) in the familiar parametrisation of the circle, (cos θ, sin θ). Then we
get the well-known characterisation of the Pythagorean triples: if m and n, m > n,
are coprime integers of opposite parity then the numbers

    m^2 - n^2,    2mn,    m^2 + n^2

form a primitive Pythagorean triple and every primitive Pythagorean triple arises this
way.

History of FLT

• 1640, Fermat himself proved the case n = 4.

• 1770, Euler proved the case n = 3 (Gauss also gave a proof).

• 1823, Sophie Germain proved the first case of FLT (the first case of
FLT holds if there is no solution of X^p + Y^p = Z^p for which p does
not divide the product XYZ) for a class of primes, the Sophie Germain
primes: primes p such that 2p + 1 is also a prime.

• 1825, Dirichlet, Legendre proved FLT for n = 5.

• 1832, Dirichlet treated successfully the case n = 14.

• 1839, Lamé proved the case n = 7.

• 1847, Kummer proved FLT in the case when the exponent is a regular
prime. But it is not known, even today, whether there are infinitely
many Sophie Germain primes or regular primes.

• 1983, Faltings gave a proof of Mordell's conjecture.

• 1986, Frey-Ribet-Serre: Shimura-Taniyama-Weil conjecture
implies FLT.

• 1994, Andrew Wiles: proof of S-T-W conjecture for semistable elliptic
curves.

This method can be used to find all rational points on a conic section whose defin¬ 
ing equation has rational coefficients, once we are able to find one such point. 
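The line-through-(-1, 0) construction for the unit circle can be made explicit. Substituting y = t(x + 1) into x^2 + y^2 = 1 and discarding the known root x = -1 gives x = (1 - t^2)/(1 + t^2), y = 2t/(1 + t^2). A minimal sketch follows; the function name `circle_point` is an assumption made for this illustration.

```python
from fractions import Fraction

def circle_point(t):
    """Second intersection of the line y = t(x + 1) with the unit circle."""
    t = Fraction(t)
    x = (1 - t * t) / (1 + t * t)
    y = 2 * t / (1 + t * t)
    assert x * x + y * y == 1       # exact check in rational arithmetic
    return x, y

for t in [Fraction(1, 2), Fraction(2, 3), Fraction(1, 4)]:
    print(t, circle_point(t))
```

With slope t = n/m the point is ((m^2 - n^2)/(m^2 + n^2), 2mn/(m^2 + n^2)); for instance t = 1/2 gives (3/5, 4/5) and t = 2/3 gives (5/13, 12/13).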

Elliptic Curves 

Consider the following classical problems. 

(i) Find all n such that the sum of the squares of the first n natural numbers is a
square. That is, we have to find natural numbers n and m such that

    m^2 = n(n + 1)(2n + 1)/6.

(ii) (Diophantus) Find three rational right triangles of equal area.

Let A denote the area of the right triangle with sides a (= p^2 - q^2), b (= 2pq)
and c (= p^2 + q^2); thus A = pq(p^2 - q^2). Then if we put x = p/q we get a
rational point (p/q, 1/q^2) on the curve

    Ay^2 = x^3 - x.

Conversely, if (a/b, c/d) is a rational point on this curve then the right triangle
with d(a^2 - b^2)/(b^2 c) and 2ad/(bc) as legs also has area equal to A.

(iii) (From an Arab manuscript dated before the 9th century) Given a natural number
n, find a rational number u, such that both u^2 + n and u^2 - n are squares (of
rational numbers).

What is elliptic about elliptic curves?

Ellipses are not elliptic curves! Elliptic curves are so called because it was in
connection with the problem of computing arc lengths of ellipses that they
were first studied systematically. When we compute the arc length of a circle, we
have to integrate the function 1/√(1 - x^2), which we do in terms of sine
and cosine functions. The trigonometric functions are therefore called circular
functions. Similarly, to compute the arc length of an ellipse, we have to
integrate functions of the form

    1/√[(1 - x^2)(1 - k^2 x^2)].

This integral cannot be computed using circular functions and mathematicians
worked on this problem for many years before Abel and Jacobi,
independently, introduced elliptic functions to compute such integrals. Just as
sin and cos satisfy x^2 + y^2 = 1, the elliptic functions satisfy an equation of
the form y^2 = f(x) where f(x) is a cubic.


If such a u can be found then n is called a congruent number. A number n being 
congruent is equivalent to the existence of a right triangle with rational sides and 
area n (see Chapter 12). 

Let n be a congruent number and let u be such that u^2 + n = a^2 and u^2 - n = b^2.
Multiplying the two equations together we get

    u^4 - n^2 = (ab)^2.

Multiplying by u^2 throughout we get

    u^6 - n^2 u^2 = (abu)^2.

Putting u^2 = x and abu = y we get a rational point on the curve, E, defined by the
equation

    y^2 = x^3 - n^2 x.


Exercise. Conversely, if (x,y) is a rational point on E such that x is a rational 
square and has an even denominator, then n (whose square appears as the coefficient 
of x) is a congruent number. 

In each of the above problems, we were led to consider equations of the form
y^2 = f(x), where f(x) is a cubic polynomial in x with rational coefficients and
distinct roots. Such equations define elliptic curves. We could think of elliptic curves 
as the set of all rational/real/complex solutions of such equations. The set of all 
complex solutions of an elliptic curve can be identified with the points on a torus. 
The figures below (Figure 15.1) show what the real and complex points on an elliptic 
curve look like. 

Figure 15.1 Typical illustration depicting what the real/complex points on an elliptic
curve look like.

Finding rational points on an elliptic curve turns out to be a difficult problem
and though many deep results have been proved (one of them by Andrew Wiles
along with John Coates), a lot remains to be done in this area. The study of elliptic
curves is currently a very active field of research involving many different areas of
mathematics.

If we try to imitate the method we used for a conic to get more rational points
from one such point we are stuck. This is because generally, a line meets a cubic
curve at three points and we cannot conclude that the other points of intersection are
rational. That is, if one root of a cubic equation with rational coefficients is rational,
the other two roots could be irrational; they could be conjugate surds, for instance.
What is true is that if you draw the line joining two rational points, then the third
point where this line meets the cubic will also be a rational point. Thus, we can 'add'
two rational points to get a third rational point. It turns out that we could take the
'point at infinity' as the identity or the 'zero' element and obtain a structure of a
group (in fact, a commutative group) on the set of rational points of an elliptic curve
by declaring the sum of three collinear points to be zero; the inverse or 'negative'
of the point (x, y) is the point (x, -y). Thus, to add two points P and Q join them
by a straight line, find the third point of intersection of the line with the curve and
reflect it in the x-axis to get a point, R, on the curve which will then be the 'sum' of
P and Q.

Exercise. Consider the elliptic curve, E, defined by the equation $y^2 = ax^3 + bx^2
+ cx + d$. Obtain an expression for the coordinates $x_3, y_3$ of the sum of the two points
$P = (x_1, y_1)$ and $Q = (x_2, y_2)$ on E in terms of $x_1, x_2, y_1, y_2$.

Hint: If P is not equal to Q, $x_3 = -x_1 - x_2 - (b/a) + \frac{(y_2 - y_1)^2}{a(x_2 - x_1)^2}$,
and if P = Q, $x_3 = -2x_1 - (b/a) + \frac{(f'(x_1))^2}{a(2y_1)^2}$, where $f(x)$ denotes the
cubic.
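
The chord-and-tangent rule and the formulas in the hint can be tried out numerically. What follows is only a minimal sketch, assuming exact rational arithmetic with Python's `fractions.Fraction`. The function name `add_points`, the use of `None` for the point at infinity, and the sample curve $y^2 = x^3 + 17$ are illustrative choices and are not taken from the text.

```python
from fractions import Fraction


def add_points(P, Q, a, b, c, d):
    """Add two points on y^2 = a*x^3 + b*x^2 + c*x + d by the chord-and-tangent rule.

    Points are (x, y) pairs; None stands for the point at infinity (the zero element).
    The coefficient d is not used by the formulas; it is kept so that a call names the curve.
    Both points are assumed to lie on the curve.
    """
    if P is None:
        return Q
    if Q is None:
        return P
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    x1, y1 = Fraction(P[0]), Fraction(P[1])
    x2, y2 = Fraction(Q[0]), Fraction(Q[1])
    if x1 == x2 and y1 == -y2:
        # P and Q are negatives of each other: the line through them is vertical.
        return None
    if x1 == x2:
        # P == Q: slope of the tangent, f'(x1) / (2*y1).
        m = (3 * a * x1 * x1 + 2 * b * x1 + c) / (2 * y1)
    else:
        # P != Q: slope of the chord.
        m = (y2 - y1) / (x2 - x1)
    # x-coordinate of the third intersection (from the sum of the roots of the cubic in x).
    x3 = m * m / a - b / a - x1 - x2
    # Reflect the third intersection point in the x-axis.
    y3 = -(y1 + m * (x3 - x1))
    return (x3, y3)


P, Q = (-2, 3), (-1, 4)               # both lie on y^2 = x^3 + 17
print(add_points(P, Q, 1, 0, 0, 17))  # chord through P and Q
print(add_points(P, P, 1, 0, 0, 17))  # tangent at P ("doubling")
```

For these inputs the sketch gives the points (4, -9) and (8, -23), and one can check by hand that both satisfy $y^2 = x^3 + 17$.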

The structure of a group on the set of rational points of an elliptic curve
provides us with a powerful tool to study diophantine equations. For instance, in
problem (ii), if we get one rational point then we could ‘double’ it (i.e., draw the
tangent at that point) to get one more point and then add these two to get yet another
point, and so on. In fact, this is what Fermat used to get more solutions to the
problem (even Diophantus used this procedure but he gave only three rational points).
In the congruent number problem, it turns out that the double of any rational point
which is not of order 2 is such that the x-coordinate is a square number with an even
denominator.

The method we used to show the non-existence of solutions of = 3Z^ by 

showing that the equation has no solutions modulo i is a standard method we use in 
studying diophantine equations. Assume that the equation has integer coefficients by 
clearing the denominators, if necessary. We reduce the equation modulo a prime p by 
replacing the coefficients of the equation by their remainders when divided by p and 
consider the set of solutions of the reduced equation in the finite field {0, 1, 2, ...,
p − 1}. If, for example, we find a prime for which there are no solutions for the
reduced equation, it follows immediately that the original equation has no rational 
roots. 
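
The reduction step described above can be sketched by brute force for small primes. The helper name `solutions_mod_p` and the sample equation $x^2 - 3y^2 = 2$ are assumptions made purely for illustration; the equation is not the one discussed in the text. Any integer solution would reduce to a solution modulo every prime, so an empty list for a single prime already rules out integer solutions.

```python
from itertools import product


def solutions_mod_p(poly, p, nvars=2):
    """Return all tuples in {0, ..., p-1}^nvars at which poly vanishes modulo p."""
    return [v for v in product(range(p), repeat=nvars) if poly(*v) % p == 0]


def example(x, y):
    # Illustrative equation x^2 - 3*y^2 = 2, written as a polynomial equal to zero.
    return x * x - 3 * y * y - 2


print(solutions_mod_p(example, 3))       # no solutions modulo 3
print(len(solutions_mod_p(example, 7)))  # solutions do exist modulo 7
```

The choice of prime matters: the same equation has solutions modulo 7, so that prime alone would tell us nothing.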

Consider an elliptic curve E defined by $y^2 = f(x)$. Except for a finite set of
primes depending on the cubic $f(x)$, the reduced equation will also define an elliptic
curve. In fact, the exceptional set of primes is precisely the set of prime divisors
of the discriminant of the cubic $f(x)$. For a prime p not dividing the discriminant,
let $N_p$ denote the number of points of E modulo p, i.e., the number of pairs (x, y),
with x, y in {0, 1, 2, ..., p − 1}, satisfying the equation modulo p, counted together
with the point at infinity. Define integers $a_p$ by

$$N_p = p + 1 - a_p.$$

These $a_p$’s could be positive or negative and Hasse proved the following inequality
in 1930:

$$|a_p| < 2\sqrt{p}.$$

These numbers contain a lot of information about the rational points of the elliptic 
curve and there are many conjectures concerning their properties among which the 
Birch-Swinnerton-Dyer conjecture and the Shimura-Taniyama-Weil conjecture are 
the most important. 
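
For small primes the point counts and the integers $a_p$ discussed above can be computed directly. The sketch below assumes that the point at infinity is counted along with the pairs (x, y), which is the usual convention behind the relation $N_p = p + 1 - a_p$. The curve $y^2 = x^3 - x$ (whose cubic has discriminant 4) and the function names are illustrative choices.

```python
import math


def count_points(f, p):
    """Number of points on y^2 = f(x) modulo p, including the point at infinity."""
    # square_roots[r] = how many y in {0, ..., p-1} satisfy y^2 = r (mod p).
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    affine = sum(square_roots[f(x) % p] for x in range(p))
    return affine + 1  # assumption: the point at infinity is counted as well


def a_p(f, p):
    """The integer a_p defined by N_p = p + 1 - a_p."""
    return p + 1 - count_points(f, p)


def f(x):
    return x ** 3 - x  # illustrative cubic; its discriminant is 4


for p in [3, 5, 7, 11, 13]:
    ap = a_p(f, p)
    print(p, count_points(f, p), ap, abs(ap) < 2 * math.sqrt(p))
```

The last column checks Hasse's inequality for each prime; the primes listed do not divide the discriminant, so the reduced curve is again an elliptic curve.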

The content of the Shimura-Taniyama-Weil (S-T-W) conjecture is that these $a_p$’s
are the Fourier coefficients of a cusp form (of weight 2 and a certain level N). The
definition of cusp forms is beyond the scope of this chapter and we content ourselves
by saying that they are certain functions on the upper half-plane (please see
Suggested Reading at the end). Elliptic curves for which the $a_p$’s satisfy the S-T-W
conjecture are called modular elliptic curves.

Frey Elliptic Curve and Fermat’s Last Theorem

The study of rational points on higher degree curves witnessed a breakthrough in
1983 when Gerd Faltings proved a conjecture of Mordell. As a corollary, it follows
that the curve $X^n + Y^n = 1$ has only finitely many rational points if n > 5, which
means that there would be at most finitely many solutions to the Fermat equation

$$X^n + Y^n = Z^n.$$


The general feeling among mathematicians following this was one of satisfaction 
since there was no reason or heuristic basis as to why FLT should be true; at most 
finitely many solutions was good enough. 

But FLT bounced back soon after in 1985, when Gerhard Frey linked a counter 
example of FLT, if there is one, with an elliptic curve which did not seem to satisfy 
the S-T-W conjecture! Frey’s was a simple but very ingenious idea: if, for some 
prime p > 3, there are non-zero integers u, v, w such that $u^p + v^p = w^p$, then consider
the elliptic curve, now referred to as the Frey curve,

$$y^2 = x(x + u^p)(x - v^p).$$

Thus for the first time, FLT for any exponent was connected with a cubic curve 
instead of the higher degree curve which the equation itself defines. 

Then things started happening fast and in the summer of 1986, building on the 
work of Frey and Serre, Ribet succeeded in proving that S-T-W implies FLT by 
showing that the Frey curve could not be modular. Now, FLT was not just a curiosity 
but was related to a deep conjecture; if it were not true and we had a counter example, 
the Frey curve would be sticking out like a sore thumb! 

Soon after he heard of Ribet’s result, Andrew Wiles went to work on the S-T-W 
conjecture in the late summer of 1986. After working hard on it for seven years, 
during which time even his closest friends did not get to know what he was up to,
Wiles stunned the mathematical world by claiming that he had proved the FLT by
proving a particular case of the S-T-W conjecture, the case of semi-stable elliptic 
curves. He made the announcement at the end of a series of lectures at the Isaac 
Newton Institute in Cambridge, England on the morning of Wednesday, June 23, 
1993. But experts checking his proof found many gaps of which he could overcome 
all but one. It is to the credit of Wiles that he did not let this setback deter him. 
Rather, encouraged and mathematically supported by his students and close friends, 
notably Henri Darmon, Fred Diamond and Richard Taylor, he circumvented the gap 
in September 1994. His paper, along with another one of his jointly with Richard
Taylor, occupies one whole issue of the leading journal Annals of Mathematics, 141
(1995). It should be remarked that the theorem Wiles proved is a very significant
result with far-reaching consequences and FLT follows as a simple corollary.

Apparently, Fermat’s favourite target for his problems and challenges were the 
English mathematicians; after all, he was French! Thus, it is fitting that his most 
famous challenge has been answered by Wiles, an Englishman, though it took a 
while (A Wiles!) coming! 
Some Unsolved Problems in Number Theory

Progress Made in Recent Times 

K Ramachandra 

The beauty of the theory of numbers is that it poses so many simple-looking 
problems, most of which remain unsolved even today. Many of these problems 
have come down to us from ancient times, indicating the age-old 
fascination that human beings have felt for numbers. We list a few of these 
problems below, describing some known results and indicating the progress 
made in recent times. 


The Infinitude of Primes 

It is easy to show that the list 2,3,5,7,11,... of primes does not terminate. The 
biggest prime known explicitly today has more than 10^ digits! Now consider pairs 
of primes that differ by 2, for instance (3,5), (5,7), (11,13), (17,19),... . These are 
the so-called twin primes. It is not known as of today whether the list of twin primes 
terminates or not. It is known that the sum $1/3 + 1/5 + 1/11 + 1/17 + \cdots = \sum 1/p$
taken over all primes p such that p + 2 is prime is finite (indeed, the sum, known
as Brun's constant, can be computed to a fair degree of accuracy), but this does
not prove that there are only finitely many such primes. (It is clearly possible for a 
sum of infinitely many positive numbers to be finite; for instance, this happens with 
the sets ||, • • • } ^^d {j, |, • }• Ancient Greeks believed this 

was impossible. The well-known paradoxes of Zeno are related to this observation.) 
The best that we know today is that the list of pairs (p, q) of primes with

$$0 < p - q < c \ln p \qquad \left(c = \tfrac{1}{4}\right)$$

does not terminate. This is a very deep result due to H Maier of Germany. (Actually
his constant c is slightly less than 1/4.) We are very far from this result for, say,
c = 1/100.
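
The slow growth of the sum of 1/p over primes p with p + 2 also prime, mentioned earlier in this section, can be observed numerically. This is a rough sketch with illustrative function names; it computes the sum exactly as the text describes it (one term per pair), whereas Brun's constant is often defined using both members of each pair.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def twin_sum(limit):
    """Partial sum of 1/p over primes p <= limit for which p + 2 is also prime."""
    primes = primes_up_to(limit + 2)
    prime_set = set(primes)
    return sum(1.0 / p for p in primes if p <= limit and p + 2 in prime_set)


for limit in [10 ** 2, 10 ** 4, 10 ** 6]:
    print(limit, twin_sum(limit))
```

The partial sums increase only very slowly with the limit, which is consistent with (but of course does not prove) the finiteness of the full sum.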
Another question deals with the number $\pi(x)$ of primes p below x. It was noticed
by Legendre, Gauss, Riemann and others that $\pi(x)$ is roughly equal to $x/\ln x$; this is
equivalent to saying that the nth prime is roughly equal to $n \ln n$. Chebychev showed
that there exist constants a, b such that

$$a \frac{x}{\ln x} < \pi(x) < b \frac{x}{\ln x}$$

for all x. Using the methods of complex variables, Hadamard and de la Vallée Poussin
proved independently in the 1890’s that

$$\lim_{x \to \infty} \frac{\pi(x)}{x/\ln(x)} = 1.$$
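
The limit above can be explored numerically by counting primes with a sieve. This is a small sketch with an illustrative function name; it counts the primes not exceeding x.

```python
import math


def prime_count(x):
    """pi(x): the number of primes p <= x, via the sieve of Eratosthenes."""
    if x < 2:
        return 0
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if is_prime[i]:
            # Cross out the multiples of i starting from i*i.
            is_prime[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(is_prime)


for x in [10 ** 2, 10 ** 4, 10 ** 6]:
    print(x, prime_count(x), prime_count(x) / (x / math.log(x)))
```

The ratio in the last column approaches 1 only slowly, which illustrates why "roughly equal" has to be read in the asymptotic sense.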

Instead of $\pi(x)$, it is nicer to deal with the function $\theta(x)$ which counts the prime p
with the weight ln p; that is, $\theta(x) = \sum_{p \le x} \ln p$. It was proved around the turn of the
century that

$$|\theta(x) - x| < x \, e^{-h\sqrt{\ln x}}$$

for all X > 10*^^ and a certain absolute positive constant h. The precise value of h is 
not important. One of the deepest results in prime number theory is the theorem that 

the term $\sqrt{\ln x}$ can be replaced by

$$(\ln x)^{3/5} (\ln \ln x)^{-1/5}.$$


This result is due to the Soviet mathematician I M Vinogradov. 


Additive Prime Number Theory 

In 1742 Goldbach asked, in a letter to Euler, whether every even number from 6
onwards can be expressed as a sum of two odd primes. The answer to this question
is unknown even today! The achievements in this problem have a very long history.
Using the so-called ‘circle method’ pioneered by Ramanujan-Hardy, Hardy and
Littlewood showed that if the hypothesis formulated below holds true, then every odd
number from some point onwards can be expressed as a sum of 3 odd primes.

The hypothesis is stated in terms of the following function $\mu$ defined on the set of
positive integers:

$$\mu(n) = \begin{cases} 1, & \text{for } n = 1; \\ 0, & \text{if } n \text{ is divisible by the square of a prime;} \\ (-1)^k, & \text{if } n \text{ is the product of } k \text{ distinct primes.} \end{cases}$$

Let a, b be positive integers, and let h > 3/4 be a constant. The hypothesis then states
that the following inequality holds for all x > N(a, b, h), where N is some function
that depends only on a, b, h:

$$\left| \sum_{1 \le n \le x} \mu(an + b) \right| < x^h.$$

This hypothesis is open as of today. It is considered very difficult to prove, even in 
the special case a = b=l,h=\ - 10“^^^. 
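
The three-case function defined above is the Möbius function, usually written $\mu$, and the sum appearing in the hypothesis can be computed for small x. The sketch below uses trial division; the function names and the sample values of a, b and the exponent 3/4 are illustrative. A finite computation of this kind says nothing about whether the hypothesis is true.

```python
def mobius(n):
    """Moebius function: 1 for n = 1, 0 if the square of a prime divides n, else (-1)^k."""
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # d^2 divides the original n
            result = -result
        d += 1
    if n > 1:
        result = -result  # one prime factor is left over
    return result


def mobius_sum(a, b, x):
    """Sum of mu(a*n + b) over 1 <= n <= x."""
    return sum(mobius(a * n + b) for n in range(1, x + 1))


for x in [10 ** 2, 10 ** 3, 10 ** 4]:
    print(x, mobius_sum(1, 1, x), x ** 0.75)
```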
The Circle Method

(The ‘circle method’ was developed by Ramanujan and Hardy while they
were working on the partition problem. The problem is to find an asymptotically
accurate formula for p(n), the number of partitions of n or the number
of ways that n can be written as an unordered sum of positive integers
(p(1) = 1, p(2) = 2, p(3) = 3, p(4) = 5, ...). It has been known from the
time of Euler that

$$\prod_{j=1}^{\infty} (1 - z^j)^{-1} = 1 + \sum_{n=1}^{\infty} p(n) z^n.$$

Let f(z) denote the infinite product on the left side. The singularities of
f(z) are the roots of unity and lie densely on the unit circle |z| = 1; thus
f(z) has the unit circle as its circle of convergence. Using Cauchy’s residue
theorem, we obtain

$$p(n) = \frac{1}{2\pi i} \oint_{|z| = r} \frac{f(z)}{z^{n+1}} \, dz$$

for 0 < r < 1. Thus the problem of estimating p(n) has been converted into
one of estimating an integral. The beautiful and amazingly productive idea
pioneered by Ramanujan and Hardy was to estimate the integral by identifying
the points where ‘most’ of the contribution comes from; these are clearly
the points on |z| = r that lie ‘close’ to the poles of f(z). The practical details
are formidable, but what is of significance is that the method, originally conceived
to tackle the partition problem, has turned out to be applicable to a
large class of related problems, for instance Waring’s problem.)
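
Euler's product in the box also gives a direct way of computing p(n) for small n: multiply out the factors $1/(1 - z^j)$ one at a time as power series. The following sketch (function name illustrative) does exactly that; it is meant only to make the generating function concrete and has nothing to do with the asymptotic analysis of the circle method.

```python
def partition_numbers(limit):
    """Return [p(0), p(1), ..., p(limit)] by expanding Euler's product factor by factor."""
    p = [1] + [0] * limit  # power series of the empty product, i.e. the constant 1
    for j in range(1, limit + 1):
        # Multiply the series by 1/(1 - z^j) = 1 + z^j + z^(2j) + ...
        for n in range(j, limit + 1):
            p[n] += p[n - j]
    return p


p = partition_numbers(100)
print(p[1:5])  # the values quoted in the box: 1, 2, 3, 5
print(p[100])
```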


However, in 1937 Vinogradov proved the same result without having to use any
unproved hypothesis. A recent result in the direction of Goldbach’s conjecture is the
one by O Ramaré: Every positive even number can be expressed as a sum of not more
than 6 primes. Another result by the author and his colleagues A Sankaranarayanan and
K Srinivas is the following: let $g_n$ denote the nth even number expressible as a sum
of 2 odd primes ($g_1 = 6$, $g_2 = 8$, $g_3 = 10$, ...). We do not know whether the range
of g exhausts the even numbers beyond 6, but the following is now known:

(gn+1 - < kgn for all n, 


where k is a positive constant independent of n. 
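
The sequence of even numbers expressible as a sum of two odd primes is easy to tabulate for small values. The sketch below (function names illustrative) lists these numbers up to a limit and looks at the gaps between consecutive terms; it is a finite check only and proves nothing about the general problem.

```python
import math


def odd_primes_up_to(n):
    """List of odd primes <= n (sieve of Eratosthenes)."""
    if n < 3:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i in range(3, n + 1) if sieve[i]]


def goldbach_numbers(limit):
    """Even numbers <= limit that are a sum of two odd primes, in increasing order."""
    primes = odd_primes_up_to(limit)
    prime_set = set(primes)
    found = []
    for n in range(6, limit + 1, 2):
        if any((n - p) in prime_set for p in primes if p <= n // 2):
            found.append(n)
    return found


g = goldbach_numbers(10000)
gaps = [g[i + 1] - g[i] for i in range(len(g) - 1)]
print(g[:3])      # the first three terms: 6, 8, 10
print(max(gaps))  # largest gap between consecutive terms in this range
```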


Waring’s Problem 

Let k be any natural number greater than 1. More than two centuries back, E Waring
conjectured the following. Let $g(k) = 2^k + [1.5^k] - 2$, where [x] denotes the integer
part of x, and write g for g(k). Then every positive integer n can be expressed as a sum
of g or fewer positive kth powers; that is, for all $n \in \mathbb{N}$ there exist non-negative
integers $x_1, x_2, \ldots, x_g$ such that $n = x_1^k + x_2^k + \cdots + x_g^k$. It is not too hard to
check that the number $m = 2^k [1.5^k] - 1$ cannot be expressed as a sum of fewer than g
positive kth powers; that is, the equation

$$m = x_1^k + x_2^k + \cdots + x_{g-1}^k$$

has no solution in non-negative integers $x_i$. (Example: Let k = 3; then g = 8 + 3 − 2
= 9 and m = (8 × 3) − 1 = 23. Since 23 < $3^3$, to express 23 as a sum of positive
cubes we must use only the summands 1 and 8, and since 23 = (2 × 8) + (7 × 1),
we require at least 9 such summands. Thus 23 cannot be expressed as a sum of fewer
than 9 positive cubes.) Thus g is the most economical number of summands.
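
The worked example for cubes can be repeated for other small exponents. The sketch below computes g(k) with exact integer arithmetic and finds, by dynamic programming, the least number of positive kth powers needed for the number $2^k[1.5^k] - 1$. The function names are illustrative, and the search is feasible only for small k.

```python
def g(k):
    """The value g(k) = 2^k + [1.5^k] - 2, where [.] is the integer part."""
    return 2 ** k + 3 ** k // 2 ** k - 2


def min_kth_powers(n, k):
    """Least number of positive k-th powers summing to n (dynamic programming)."""
    powers = []
    x = 1
    while x ** k <= n:
        powers.append(x ** k)
        x += 1
    best = [0] + [n] * n  # n ones always suffice, so n is a safe upper bound
    for total in range(1, n + 1):
        best[total] = 1 + min(best[total - q] for q in powers if q <= total)
    return best[n]


for k in [2, 3, 4]:
    target = 2 ** k * (3 ** k // 2 ** k) - 1
    print(k, g(k), target, min_kth_powers(target, k))
```

For k = 2, 3, 4 the targets are 7, 23 and 79, and in each case the minimum found equals g(k), in agreement with the argument in the example.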

The current status of the problem is as follows: There exists an absolute positive 
constant C such that Waring's conjecture is true for all k > C. The proof derives 
from the ideas of Ramanujan, Hardy, Littlewood, Vinogradov, Dickson, Ridout and 
Mahler and is very complicated, running to hundreds of pages. It should be mentioned
that the proof only establishes the existence of C and gives no clue as to its
magnitude; no C, however large, can be calculated by the method of proof. 

Problems on Irrationality 

Consider the zeta function $\zeta(t)$ defined for real numbers t > 1 as follows:

$$\zeta(t) = 1 + \frac{1}{2^t} + \frac{1}{3^t} + \cdots.$$

One of the grand achievements of the century is the proof that $\zeta(3)$
is irrational. (An irrational number is one that is not expressible as a ratio of two
non-zero integers. Related to the idea of irrationality is the notion of transcendence. 
A number is algebraic if it is the root of a polynomial with integral coefficients; else 
it is transcendental. Examples of algebraic irrationals are V2, ^ and ^10 -I- y/Ti, 
and examples of transcendental numbers are $\pi$, e and ln 2 (here e = 2.71828... is
Euler’s number). The proof that a given number is transcendental can be extremely
difficult.) The proof is due to R Apéry. What happens when t is an odd positive
integer greater than 3 is open. Strangely, a great deal is known when t is an even
positive integer. Indeed, it is known that the value of $\zeta(t)$ is a rational multiple of
$\pi^t$ whenever t is an even positive integer. (This has been known since the time of Euler.)
This immediately implies that $\zeta(t)$ is irrational, indeed transcendental, when t is an
even positive integer. The paucity of positive conclusions for the case when t is an 
odd positive integer is extremely curious. 
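
Euler's evaluations at even integers are easy to check numerically from partial sums of the series. The sketch below compares partial sums with the classical values $\zeta(2) = \pi^2/6$ and $\zeta(4) = \pi^4/90$; the function name and the number of terms are illustrative. No comparable closed form is known for the value at 3, which is printed only for comparison.

```python
import math


def zeta_partial(t, terms):
    """Partial sum 1 + 1/2^t + ... + 1/terms^t of the zeta series."""
    return sum(1.0 / n ** t for n in range(1, terms + 1))


print(zeta_partial(2, 10 ** 5), math.pi ** 2 / 6)
print(zeta_partial(4, 10 ** 5), math.pi ** 4 / 90)
print(zeta_partial(3, 10 ** 5))
```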

Much the same can be said for Euler’s constant $\gamma$ defined thus:

$$\gamma = \lim_{n \to \infty} \left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln n\right).$$

Amazingly, it is not known whether $\gamma$ is rational or not.
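
The limit defining Euler's constant converges slowly but steadily, as a direct computation shows. This is a minimal sketch with an illustrative function name; floating-point arithmetic limits the accuracy for very large n.

```python
import math


def gamma_estimate(n):
    """The quantity 1 + 1/2 + ... + 1/n - ln n, which tends to Euler's constant."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)


for n in [10, 10 ** 3, 10 ** 6]:
    print(n, gamma_estimate(n))
```

The printed values decrease towards 0.5772..., the error being roughly 1/(2n).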

The transcendency of numbers such as $\pi + \ln 2$ was first proved by A Baker. These
are deep results. 
Concluding Remarks


It appears that there is no dearth of attractive problems. What is needed are solutions! 
What has been solved is very little and what remains to be solved is vast. In figurative 
terms, what has been solved can be likened to an egg-shell, and what remains to be
solved to the infinite space surrounding it. 


Addendum to “Some Unsolved Problems in Number Theory”

(Resonance, May 1997)


1. S S Pillai: The omission of the name S S Pillai (Siva Sankaranarayana
Pillai) in connection with Waring’s problem is very serious.
In a series of papers, Pillai proved that if k > 6 and further if
$(3^k + 1)/(2^k - 1) < [1.5^k] + 1$ then Waring’s conjecture is correct
for that k. Around the same time (but a little later) L E Dickson
proved this with k >1 and $(3^k + 1)/(2^k - 1) < [1.5^k] + 1$.
The inequality $(3^k + 1)/(2^k - 1) < [1.5^k] + 1$ was proved for
all integers exceeding a certain constant C (same C as in the paragraph
on Waring’s problem) by K Mahler. The history of this discovery
is very well explained in Introduction to the Theory of Numbers
by G H Hardy and E M Wright (see notes at the end of
chapter XXI). For another treasure house of information regarding
priority of Pillai’s work see K Chandrasekharan, S S Pillai (obituary),
J. Indian Math. Soc., Vol. 15, pp 1-10, 1951. Regarding Pillai’s
achievements I mention the following: when I was in the Institute
for Advanced Study, Princeton, USA, during 1970-71, I noticed
in the Institute Library a book by G H Hardy where he places
Pillai as the greatest Indian mathematician after Srinivasa Ramanujan.
Waring’s conjecture was proved for k = 5 by Chen Jing-Run
(around 1970) and for k = 4 by R Balasubramanian, J-M Deshouillers
and F Dress in 1989. Cases k = 2 and 3 were
disposed of (by simpler methods) by Lagrange and Wieferich
respectively. About Pillai I have the following comment: Once I
was talking to a responsible Indian specialist dealing with History
of Mathematics. I was very surprised when I came to know
that he had not heard of Pillai at all. I can account for it
as follows. Pillai was very unassuming; he was a member of
the Indian Mathematical Society all right; but he was not a fellow
of any of the academies and he had no publicity whatsoever
amongst mathematicians who had not looked at the book by
G H Hardy and E M Wright mentioned earlier.

2. The equation $|\mu(an + b)| < x^h$ under the section ‘Additive Prime
Number Theory’ should read

$$\left| \sum_{1 \le n \le x} \mu(an + b) \right| < x^h.$$

3. A comment on The Circle Method in the box on page 78:

The function f(z) is analytic in |z| < 1 and it does not exist anywhere
in |z| > 1. (So the terminology poles of f(z) is not correct). We have to
make r a suitable function of n but still less than 1. Then decompose this 
circle into small bits in a particular way and obtain asymptotics of each bit. 
The cumulative effect of adding all these asymptotics will give the Hardy- 
Ramanujan formula for partitions. Actually Ramanujan in his first letter 
(this letter was written from the Madras Port Trust) to Hardy mentions (see 
equation 1.14 of Twelve Lectures) that the integer q(n) defined by 



(note that LHS is the product 




is the integer nearest to 



When questioned about this, he wrote in a letter that it is “not the integer 
nearest to but this main term plus ... ”. (Compare this main term with the 
first term of the Hardy-Ramanujan-Rademacher formula for p{n)). 


K Ramachandra 


K Ramachandra 
Honorary Visiting Professor 
National Institute of Advanced Studies 
Indian Institute of Science Campus 
Bangalore 560 012 